Open steff1088 opened 2 years ago
Hi,
I can't tell exactly since I don't have the command you used or the data, but the error message (found 446 reads when expected 223) suggests to me that the read sets are interleaved, since 223*2=446.
Does that help?
Thank you very much for the quick response.
The command I used was: graftM graft --threads 8 --evalue 0.000000001 --forward 11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME.fastq --graftm_package 500PSI_mcrAs_refined.gpkg --output_directory GraftM_output_11774.2.218915.CGAACTG-ACAGTTC_500PSI_mcrAs_refined_package --force
If the reads are interleaved, what can I do to make them compatible with the graft command?
Hi,
You can use the --interleaved flag instead of --forward. You can tell whether the reads are interleaved just by looking at the head of the file: it will have pairs of reads with the same name. Alternatively, you can split the file into separate forward and reverse files; there are plenty of tools out there for doing that.
ben
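For readers who want to script the check described above, here is a minimal Python sketch of both options: peeking at the first records to see whether names come in pairs, and splitting an interleaved file in two. It is not part of GraftM. The helper names (`fastq_records`, `base_name`, `looks_interleaved`, `split_interleaved`) are hypothetical, and it assumes strict 4-line FASTQ records whose mates share a name up to the first whitespace (optionally followed by `/1` or `/2`).

```python
import io
from itertools import islice


def fastq_records(handle):
    """Yield (header, seq, plus, qual) tuples; assumes strict 4-line records."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line: treat as end of input
            return
        yield tuple(record)


def base_name(header):
    """Read name without '@', trailing comment, or /1 or /2 suffix (assumed layout)."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def looks_interleaved(handle, pairs_to_check=100):
    """True if the first records come in consecutive same-name pairs."""
    records = list(islice(fastq_records(handle), 2 * pairs_to_check))
    if len(records) < 2 or len(records) % 2:
        return False
    return all(
        base_name(records[i][0]) == base_name(records[i + 1][0])
        for i in range(0, len(records), 2)
    )


def split_interleaved(handle, forward_out, reverse_out):
    """Write alternating records to forward_out and reverse_out."""
    for i, record in enumerate(fastq_records(handle)):
        out = forward_out if i % 2 == 0 else reverse_out
        out.write("\n".join(record) + "\n")


# Tiny made-up example with two read pairs.
demo = (
    "@read1 1:N:0\nACGT\n+\nIIII\n"
    "@read1 2:N:0\nTTGA\n+\nIIII\n"
    "@read2 1:N:0\nGGCC\n+\nIIII\n"
    "@read2 2:N:0\nCCGG\n+\nIIII\n"
)
print(looks_interleaved(io.StringIO(demo)))  # True

fwd, rev = io.StringIO(), io.StringIO()
split_interleaved(io.StringIO(demo), fwd, rev)
```

On a real file you would pass open file handles instead of `io.StringIO`. This thread only confirms the --interleaved and --forward options, so check the GraftM help for how to supply split files.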
Thanks Ben, that did the trick!
-steffen
Hi all,
I ran into issues running my mcrA package on a big 45 GB metagenome in fastq format. I can't really interpret the error message so I was wondering if you had any ideas. The package runs fine on other metagenomes in fasta and fastq format. @wwood @geronimp
GraftM 0.13.1
04/23/2022 01:38:19 PM INFO: Working on 11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME
Traceback (most recent call last):
File "/home/users/sbuessec/.local/bin/graftM", line 415, in <module>
Run(args).main()
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/run.py", line 613, in main
self.graft()
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/run.py", line 388, in graft
diamond_db
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/timeit.py", line 10, in timed
result = method(*args, **kw)
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 851, in aa_db_search
hit_reads_orfs_fasta)
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 943, in search_and_extract_orfs_matching_protein_database
hits
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/graftm/sequence_searcher.py", line 534, in _extract_from_raw_reads
extern.run(extract_cmd, stdin='\n'.join(input_reads))
File "/home/users/sbuessec/.local/lib/python3.6/site-packages/extern/__init__.py", line 41, in run
raise ExternCalledProcessError(process, command)
extern.ExternCalledProcessError: Command mfqe --output-uncompressed --fasta-read-name-lists /dev/stdin --input-fasta <(awk '{print ">" substr($0,2);getline;print;getline;getline}' '11774.2.218915.CGAACTG-ACAGTTC.filter-METAGENOME.fastq') --output-fasta-files '/tmp/_raw_extracted_reads.famb1zbzrb' returned non-zero exit status 101.
STDERR was: b"[2022-04-23T20:45:46Z INFO mfqe] Read in 223 read names from /dev/stdin\n[2022-04-23T20:45:46Z INFO mfqe] Iterating input FASTQ file\n[2022-04-23T20:47:38Z INFO mfqe] Extracted 446 reads from 120829412 total\nthread 'main' panicked at 'Mismatching numbers of read names were observed. Expected:\n[223]\nbut found\n[446]', src/main.rs:333:9\nnote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace\n"
STDOUT was: b''
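The mismatch in the log (223 names read, 446 reads extracted) can be reproduced in miniature. The Python sketch below mimics what the failing command appears to do: the awk one-liner rewrites each 4-line FASTQ record as a FASTA header plus sequence, and the extraction step then pulls out every record whose name is in the list. The matching rule (name is the header up to the first whitespace) is an assumption inferred from the error message, not mfqe's actual implementation, and the function names and reads are made up.

```python
def fastq_to_fasta_lines(fastq_text):
    """Mimic the awk one-liner: keep header (as '>') and sequence, drop '+' and quality."""
    lines = fastq_text.splitlines()
    out = []
    for i in range(0, len(lines) - 3, 4):
        out.append(">" + lines[i][1:])  # like substr($0,2): header without '@'
        out.append(lines[i + 1])        # sequence line
    return out


def extract_by_name(fasta_lines, wanted_names):
    """Return (name, sequence) for every record whose name is in wanted_names."""
    wanted = set(wanted_names)
    hits = []
    for header, seq in zip(fasta_lines[0::2], fasta_lines[1::2]):
        name = header[1:].split()[0]  # assumed: name ends at first whitespace
        if name in wanted:
            hits.append((name, seq))
    return hits


# Made-up interleaved input: each name occurs twice (forward and reverse mate).
interleaved = (
    "@read1 1:N:0\nACGT\n+\nIIII\n"
    "@read1 2:N:0\nTTGA\n+\nIIII\n"
    "@read2 1:N:0\nGGCC\n+\nIIII\n"
    "@read2 2:N:0\nCCGG\n+\nIIII\n"
)
names = ["read1", "read2"]
hits = extract_by_name(fastq_to_fasta_lines(interleaved), names)
print(len(names), len(hits))  # 2 4
```

Every requested name matches both mates, so the number of extracted records is twice the number of names, which is the same pattern as 223 versus 446.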