Closed oushujun closed 3 years ago
Hi Rengang,
I get errors when I specify more threads than there are input sequences. Could you take a look?
2020-10-19 04:38:27,617 -WARNING- exit code 1 for CMD 'hmmscan --notextw -E 0.01 --domE 0.01 --noali --domtblout ./tmp/chunk_aaseq.14.fasta.domtbl /opt/conda/lib/python3.7/site-packages/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm ./tmp/chunk_aaseq.14.fasta' 2020-10-19 04:38:27,617 -WARNING- STDOUT: b'' STDERR: b'\nError: Sequence file ./tmp/chunk_aaseq.14.fasta is empty or misformatted\n\n'
2020-10-19 04:38:27,617 -WARNING- exit code 1 for CMD 'hmmscan --notextw -E 0.01 --domE 0.01 --noali --domtblout ./tmp/chunk_aaseq.15.fasta.domtbl /opt/conda/lib/python3.7/site-packages/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm ./tmp/chunk_aaseq.15.fasta' 2020-10-19 04:38:27,617 -WARNING- STDOUT: b'' STDERR: b'\nError: Sequence file ./tmp/chunk_aaseq.15.fasta is empty or misformatted\n\n'
2020-10-19 04:38:27,618 -WARNING- exit code 1 for CMD 'hmmscan --notextw -E 0.01 --domE 0.01 --noali --domtblout ./tmp/chunk_aaseq.16.fasta.domtbl /opt/conda/lib/python3.7/site-packages/TEsorter/database/REXdb_protein_database_viridiplantae_v3.0_plus_metazoa_v3.hmm ./tmp/chunk_aaseq.16.fasta' 2020-10-19 04:38:27,618 -WARNING- STDOUT: b'' STDERR: b'\nError: Sequence file ./tmp/chunk_aaseq.16.fasta is empty or misformatted\n\n'
Best, Shujun
Yes, some of the chunk FASTA files will be empty in that case. I will fix it.
I now filter out empty files by adding the following line to the hmmscan_pp function in TEsorter.py:
chunk_files = [chunk_file for chunk_file in chunk_files if os.path.getsize(chunk_file)>0]
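For anyone reading later, here is a minimal standalone sketch of why the filter helps. It is not the actual TEsorter code: `split_records` and `drop_empty_chunks` are hypothetical helpers written for illustration, and the round-robin split is only an assumption about how chunks might be written. The point is that asking for more chunks than there are sequences leaves some files at zero bytes, and those are the ones hmmscan rejects as "empty or misformatted".

```python
import os


def split_records(records, n_chunks, out_dir):
    """Hypothetical splitter: write records round-robin into n_chunks FASTA files.

    If n_chunks exceeds len(records), the extra files are created but stay empty.
    """
    paths = [os.path.join(out_dir, f"chunk_aaseq.{i}.fasta") for i in range(n_chunks)]
    handles = [open(path, "w") for path in paths]
    try:
        for i, (name, seq) in enumerate(records):
            handles[i % n_chunks].write(f">{name}\n{seq}\n")
    finally:
        for handle in handles:
            handle.close()
    return paths


def drop_empty_chunks(chunk_files):
    """Keep only chunk files with a non-zero size, as in the fix above."""
    return [chunk_file for chunk_file in chunk_files if os.path.getsize(chunk_file) > 0]
```

With two sequences and four requested chunks, `split_records` returns four paths, and `drop_empty_chunks` keeps only the two that actually contain a sequence, so hmmscan is never started on an empty file.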