ZiyueYang01 / VirID

VirID: An integrated platform for the discovery and characterization of RNA Viruses
MIT License

Error deleting duplicate reads,b'' #9

Closed vinicius-santos-bmc closed 1 month ago

vinicius-santos-bmc commented 1 month ago

I ran into this error and I don't know exactly why. The log reports it as an unrecoverable error, so it could be something new for the pipeline.

VirID assembly_and_basic_annotation -i SRR10078305_1.fastq -i2 SRR10078305_2.fastq -out_dir output_virid  --threads 24
[2024-09-29 21:41:04] INFO: VirID v3.1.0
[2024-09-29 21:41:04] INFO: VirID assembly_and_basic_annotation -i SRR10078305_1.fastq -i2 SRR10078305_2.fastq -out_dir output_virid --threads 24
[2024-09-29 21:41:04] TASK: START Primary Screen PART...
[2024-09-29 21:41:04] INFO: [assembly_and_basic_annotation] Quality control of sequencing data
[2024-09-29 21:42:23] ERROR: Controlled exit resulting from an unrecoverable error or warning.
================================================================================
EXCEPTION: ExternalException
  MESSAGE: Error deleting duplicate reads,b''
________________________________________________________________________________

Traceback (most recent call last):
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/__main__.py", line 55, in main
    gt_parser.parse_options(args)
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/main.py", line 155, in parse_options
    self.assembly_and_basic_annotation(options)
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/main.py", line 49, in assembly_and_basic_annotation  
    rm_rRNA_file,filename = assembly_and_basic_annotation_item.qc_with_rm_rRNA()
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/assembly_and_basic_annotation.py", line 88, in qc_with_rm_rRNA
    after_qc_file = self._reads_qc_double(reads_1, reads_2)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/assembly_and_basic_annotation.py", line 52, in _reads_qc_double
    rmdup_item.run(
  File "/home/vinisantos/anaconda3/envs/mamba/envs/virid/lib/python3.12/site-packages/VirID/external/fasta_process.py", line 52, in run
    raise ExternalException(f'Error deleting duplicate reads,{stderr}')
VirID.biolib.exceptions.ExternalException: Error deleting duplicate reads,b''
================================================================================
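The b'' at the end of the message is Python's repr of an empty bytes object: the raise statement in the traceback interpolates the raw stderr of the external command into an f-string, and that stderr was empty. The sketch below reproduces this formatting and shows one way a wrapper could report the exit code alongside decoded stderr. The helper describe_failure is hypothetical and not part of VirID, and the child process is only a stand-in for a failing external tool.

```python
import subprocess
import sys

# Interpolating bytes into an f-string uses repr(), so empty stderr prints as b''.
stderr = b""
print(f"Error deleting duplicate reads,{stderr}")


def describe_failure(cmd):
    """Hypothetical helper (not VirID code): run cmd and summarise how it ended."""
    proc = subprocess.run(cmd, capture_output=True)
    # Decode stderr so the message is readable, and flag the empty case explicitly.
    err = proc.stderr.decode(errors="replace").strip() or "<empty stderr>"
    return f"exit code {proc.returncode}; stderr: {err}"


# Stand-in for a failing tool: a Python child that exits with status 3 and prints nothing.
print(describe_failure([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

With an empty stderr, the exit code is often the only remaining clue about why the external step failed.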

Head of the .fastq file:

@read2/lcl|NC_037644.1_cds_XP_026297620.1_11344 [gene=LOC551259] [db_xref=GeneID:551259] [protein=titin isoform X24] [protein_id=XP_026297620.1] [location=join(8853062..8853246,8853842..8854034,8855144..8855257,8855892..8856066,8857503..8857603,8857711..8857920,8858548..8858733,8858815..8858929,8859301..8859542,8859616..8859738,8859965..8860165,8860331..8860539,8864712..8865240,8865770..8865954,8866054..8866207,8866309..8866534,8867006..8867253,8872232..8872429,8886869..8886991,8887348..8887526,8888301..8888548,8889152..8889408,8889807..8889998,8890242..8890424,8891397..8892005,8892076..8892252,8892353..8892536,8892615..8892742,8892889..8893338,8893963..8894329,8894398..8894840,8896075..8896468,8896816..8897118,8897211..8897377,8898091..8898375,8898783..8898969,8899381..8899595,8900162..8900446,8902315..8902489,8910140..8910483,8910938..8911169,8911475..8911672,8911863..8912066,8912148..8912311,8912632..8915607,8916252..8916471,8917474..8918214,8919110..8919295,8919824..8920342,8921483..8921580,8921656..8921836,8922305..8922382,8929877..8930072,8930165..8930349,8953624..8953666,8954027..8954301,8954776..8954993,8955408..8955629,8956098..8956435,8956849..8957197,8957662..8957900,8958253..8958482,8958659..8958802,8958973..8959230,8959350..8959493,8959757..8960029,8960132..8960677,8960913..8961390,8961988..8962568,8962665..8963207,8963534..8963788,8964143..8964286,8965252..8965653,8966412..8966813,8967190..8967412,8967508..8967680,8968399..8968538,8968716..8969232,8969400..8969543,8969624..8969878,8970082..8972783,8972899..8973901,8974067..8974468,8974753..8975272,8975399..8975570,8975771..8975957,8977982..8978321,8978548..8979274,8979617..8979955,8980031..8980187,8980684..8980850,8981010..8981139,8995630..9003261,9003683..9004455,9004824..9005283,9005427..9005527,9005603..9005697,9005956..9006176,9007079..9007153,9008818..9008922,9009052..9009240,9009363..9009588,9009664..9009733,9009906..9010778,9011071..9011137,9011674..9011727,9014879..9025465,9026324..9035335,9036626..9039379,9039708..9039733,9039826..9046768,9047204..9047251,9047355..9047468,9047566..9047718,9047865..9048228,9048605..9048920,9049259..9049547,9050081..9050213,9050492..9050928,9051110..9051379,9051798..9052176,9052641..9052795,9052922..9053212,9053351..9053492,9053666..9053935,9054160..9054605,9054784..9055083,9055276..9055430,9055515..9055665,9056149..9056226,9056320..9056424,9057208..9057405,9057577..9057882,9058141..9058343)] [gbkey=CDS];mate1:56228-56327;mate2:56372-56470
GTTGCTTTCCACGTCGTTTGACAATCTTCTTCTTGGTGGTCTTCTTTGGTTTGCCTAATTTTGTTATTTCCTCTGTTACTGTTACCTGTTCCTTTACCTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
ZiyueYang01 commented 1 month ago

This error usually comes from the cd-hit-dup step. Run cd-hit-dup -h to check whether the installation succeeded. If it did not, compile cd-hit from the source code or use one of our packages; see the README for details.
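The check suggested above can be scripted. The following is a minimal sketch, assuming only that the executable is named cd-hit-dup and accepts -h as stated in this comment; the function check_tool is a hypothetical helper, not part of VirID. It reports which binary is resolved on PATH (useful for telling a conda build from a source build) and what the help call returns.

```python
import shutil
import subprocess


def check_tool(name, help_flag="-h"):
    """Hypothetical helper: locate `name` on PATH and run it with help_flag."""
    path = shutil.which(name)
    if path is None:
        return {"found": False, "path": None, "returncode": None, "output": ""}
    # Capture both streams as text; some tools print their help to stderr.
    proc = subprocess.run(
        [path, help_flag], capture_output=True, text=True, errors="replace"
    )
    return {
        "found": True,
        "path": path,
        "returncode": proc.returncode,
        "output": (proc.stdout + proc.stderr).strip(),
    }


if __name__ == "__main__":
    result = check_tool("cd-hit-dup")
    if not result["found"]:
        print("cd-hit-dup was not found on PATH")
    else:
        print(f"resolved to: {result['path']}")
        print(f"exit code:   {result['returncode']}")
        print(result["output"][:500])
```

A successful help call only shows that the binary starts. As the rest of this thread shows, a build can pass this check and still fail on real reads.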

vinicius-santos-bmc commented 1 month ago

cd-hit-dup -h works, but I installed it with conda. Do I need to compile cd-hit from source for it to work properly?

vinicius-santos-bmc commented 1 month ago

It turns out I can't use cd-hit-dup installed with conda. It only works when built from the source code. Thanks.