epi2me-labs / pychopper

cDNA read preprocessing

Pychopper parameters and running time #48

Closed WQQIGDB closed 9 months ago

WQQIGDB commented 11 months ago

pychopper_jobE-File slurm-7043012

I used Pychopper to process a FASTQ file. Could you please check whether these Pychopper parameters are reasonable? Also, the Pychopper command takes a very long time to run; is such a long runtime normal? I'd appreciate your kind reply!

nrhorner commented 10 months ago

Hi @WQQIGDB

Your parameters look OK. It looks like Pychopper is using too many reads for the initial parameter autotuning: it should use 10,000 by default, but in your case it's using more than 4 million. Could you try again with -Y 10000 added to your command, please? I will check whether you have uncovered a bug.
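For illustration, the suggested fix amounts to pinning the autotuning sample size on the command line. This is only a sketch: the kit (-k PCS111) and all file names are placeholders, not values from the thread.

```shell
# Cap the reads sampled for parameter autotuning at 10,000 with -Y.
# Kit and file names below are placeholders; substitute your own.
pychopper -k PCS111 -Y 10000 -t 8 \
    -r report.pdf -u unclassified.fq -w rescued.fq \
    input.fastq full_length.fastq
```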

WQQIGDB commented 10 months ago

Many thanks! I added '-Y 10000' to the command, but I didn't see much difference, apart from an out-of-memory error. Another question: how can I determine a suitable value for '--cpus-per-task' for the Pychopper sbatch job? Could you please comment on this?

pychopper_job slurm-7242300.out.txt

nrhorner commented 10 months ago

Hi @WQQIGDB

I'm not familiar with Slurm, but can you run Pychopper with more threads, maybe -t 20 or more? Then I assume you would also need to set --cpus-per-task to 20. Please let me know how you get on.
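As a sketch of how the two settings line up, a Slurm batch script would reserve one CPU per Pychopper thread; the memory request and file names here are placeholders, not values from the thread:

```shell
#!/bin/bash
#SBATCH --job-name=pychopper
#SBATCH --cpus-per-task=20   # reserve one CPU per Pychopper thread
#SBATCH --mem=64G            # placeholder; size this to your data

# Match -t to --cpus-per-task so Pychopper can use every reserved core.
pychopper -t 20 -Y 10000 input.fastq full_length.fastq
```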

WQQIGDB commented 10 months ago

Thanks for your reply! I tried different modifications and finally got Pychopper to finish. Here are the parameters: --cpus-per-task=48, -t 48, -Y 10000 and -B 10000.

I also ran into an issue when analyzing transcripts with Pinfish (https://github.com/nanoporetech/pipeline-pinfish-analysis). I'm not sure whether the resulting transcripts in the "Merged_polished_transcripts_collapsed.gff" file are reliable. As shown in the IGV snapshot, one read (red arrow) in the reverse orientation has almost the same exons as the other reads. Is it possible that this read was not correctly oriented by Pychopper? Another type of read, indicated by the green arrow, is also in the reverse orientation and is long. Could such reads be real full-length transcripts?

Here is my analysis workflow: after running "rule map_reads", I extracted the reads on the specified chromosome, merged all the BAM files from the different samples, and then ran the remaining rules in the Snakefile. I changed "minimum_cluster_size: 1" to preserve as many reads as possible. I hope you can provide some insights!

igv_snapshot
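One quick, generic way to check whether reverse-orientation reads like these dominate a locus is to tally the SAM FLAG field of the alignments overlapping it: bit 0x10 marks a reverse-strand alignment. This is a minimal sketch, not part of the Pinfish workflow; the FLAG values would come from something like `samtools view` output over the region of interest.

```python
def is_reverse(flag: int) -> bool:
    """Return True if the SAM FLAG marks a reverse-strand alignment."""
    # Bit 0x10 of the FLAG field is set for reverse-strand alignments.
    return bool(flag & 0x10)

def orientation_counts(flags):
    """Count (forward, reverse) alignments from an iterable of SAM FLAGs."""
    fwd = sum(1 for f in flags if not is_reverse(f))
    rev = sum(1 for f in flags if is_reverse(f))
    return fwd, rev

if __name__ == "__main__":
    # FLAGs 0 and 16 are plain forward/reverse primary alignments.
    print(orientation_counts([0, 0, 16, 0, 16]))  # -> (3, 2)
```

If one orientation strongly dominates at a single-gene locus, the isolated reverse reads are the ones worth inspecting for orientation errors.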

nrhorner commented 10 months ago

Hi @WQQIGDB

I'm glad Pychopper is now running for you. I'm not familiar with Pinfish, so I can't advise on that. That workflow is also deprecated, so you might want to give https://github.com/epi2me-labs/wf-transcriptomes a go.

nrhorner commented 9 months ago

Closing due to lack of response.