Closed yesh1m closed 1 month ago
How did you get the "collapsed.sorted.gff" from SMRT Link? Did you use the same reference?
They got the file from the SMRT Link output directory, using the same reference FASTA file from Ensembl. They also ran the analysis on the command line, starting from hifi.bam, but encountered the same issue.
Have they checked the status of the analysis, such as RAM usage (e.g. with htop)? It looks like there are too many scaffolds and contigs for the analysis to complete.
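To put a number on "too many scaffolds and contigs", a quick count of the sequences in the reference and in the GFF can help. This is only a sketch: the two function names are made up for illustration, and it assumes a plain-text FASTA and a tab-separated GFF with `#` comment lines.

```shell
# Hypothetical helper: count sequences in a FASTA file
# (one header line starting with ">" per sequence).
count_fasta_seqs() {
  grep -c '^>' "$1"
}

# Hypothetical helper: count distinct sequence IDs (column 1) in a GFF/GTF,
# skipping comment lines that start with "#".
count_gff_seqids() {
  grep -v '^#' "$1" | cut -f1 | sort -u | wc -l | tr -d ' '
}

# Example usage (file names taken from the command in this issue):
#   count_fasta_seqs ./Cricetulus_griseus_crigri.CriGri_1.0.dna.toplevel.fa
#   count_gff_seqids collapsed.sorted.gff
```

Comparing the two counts shows how many of the reference scaffolds actually carry transcripts.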
Yes. After a certain JH# scaffold, the process disappears without any message. Do you think specifying the number of threads via the '-j' option might address this issue?
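Nothing in this thread confirms the cause, but a process that disappears without a message has often been killed by a signal, for example by the kernel out-of-memory killer. Checking the exit status right after the run is a cheap way to tell. A minimal sketch, where `describe_exit` is a made-up helper and not part of pigeon:

```shell
# Hypothetical helper: interpret a shell exit status.
# Shells report 128 + N when a command was terminated by signal N,
# so 137 means SIGKILL (signal 9).
describe_exit() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "ok"
  elif [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "failed with exit code $status"
  fi
}

# Example usage, directly after the classify command from this issue:
#   pigeon classify ... ; describe_exit $?
```

If this reports signal 9, the kernel log (e.g. `dmesg` on Linux) may show whether the out-of-memory killer was involved.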
We can't help you with data processing. If you provide a small reproducible test case, we can have a look. Closing until a reproducible case is available.
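One way to get to a small test case is to cut the GFF and GTF down to a handful of scaffolds around the point where the run stops. A rough sketch, assuming tab-separated GFF/GTF files with the sequence ID in column 1; `subset_by_seqid` and `ids.txt` are hypothetical names:

```shell
# Hypothetical helper: keep comment lines and records whose sequence ID
# (column 1) appears in the ID list.
#   $1 = text file with one sequence ID per line (must not be empty)
#   $2 = GFF or GTF file to subset
subset_by_seqid() {
  awk -F'\t' 'NR == FNR { keep[$1]; next } /^#/ || ($1 in keep)' "$1" "$2"
}

# Example usage (ids.txt lists the scaffolds to keep, e.g. JH000801):
#   subset_by_seqid ids.txt collapsed.sorted.gff > small.gff
#   subset_by_seqid ids.txt genomic.modified.sorted.gtf > small.gtf
```

The reference FASTA would need to be reduced to the same scaffolds separately, which is not shown here.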
Sample data: Kinnex full-length RNA; "collapsed.sorted.gff" produced with both SMRT Link v13.0 and the command line
Reference data: Chinese hamster (CriGri_1.0), downloaded from Ensembl; genomic.modified.sorted.zip
Command used: `pigeon classify -d classify --log-level TRACE --log-file classify.log collapsed.sorted.gff ./genomic.modified.sorted.gtf ./Cricetulus_griseus_crigri.CriGri_1.0.dna.toplevel.fa`
Problem: The classification job does not produce the final classification.txt, summary.txt, and report.json. Only tmp files named by scaffold are created (up to JH000801). ![image](https://github.com/PacificBiosciences/pbbioconda/assets/166366130/3532cc68-cab8-453a-8912-fd3f7c48cbae)
The total number of transcripts is 454,686, but the log stopped at 405,800. ![image](https://github.com/PacificBiosciences/pbbioconda/assets/166366130/6bf660fd-292a-4b04-a192-f2dad07fb427)