Picrust2 pipeline problem

Lin1111111111 commented 2 years ago

Hi, I ran this step of the PICRUSt2 pipeline:

place_seqs.py -s ../dna-sequences.fasta -o out.tre -p 1 \
    --intermediate intermediate/place_seqs

but it produced this output:

usage: place_seqs.py [-h] -s PATH [-r PATH] -o PATH [-p PROCESSES] [--intermediate PATH] [--min_align MIN_ALIGN] [--chunk_size CHUNK_SIZE] [--verbose] [-v]
place_seqs.py: error: unrecognized arguments: dna-sequences.fasta

Thank you for any assistance you can provide,

sincerely,

gavinmdouglas commented 2 years ago

Hey @Lin1111111111,

That error means that the tool is interpreting dna-sequences.fasta as an argument flag, like -s or -o.

This usually happens because the hyphen characters aren't being interpreted correctly, which can occur if you copy commands from a website or edit them in Microsoft Word, where hyphens may be replaced by look-alike dash characters. I would suggest re-typing the hyphen characters to see if that fixes it.
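One way to check a pasted command for this is sketched below. It is only an illustration, not part of PICRUSt2: the function names `find_suspicious_dashes` and `normalize_dashes` and the file name `seqs.fasta` are made up for the example, and the code assumes that any Unicode dash other than the ASCII hyphen is unintended.

```python
import unicodedata


def find_suspicious_dashes(command):
    """Return (position, character, Unicode name) for each non-ASCII dash."""
    hits = []
    for pos, ch in enumerate(command):
        # Category "Pd" is dash punctuation (en dash, em dash, ...).
        # U+2212 MINUS SIGN is category "Sm", so it is checked separately.
        is_dash = unicodedata.category(ch) == "Pd" or ch == "\u2212"
        if is_dash and ch != "-":
            hits.append((pos, ch, unicodedata.name(ch)))
    return hits


def normalize_dashes(command):
    """Replace every look-alike dash with a plain ASCII hyphen."""
    bad = {ch for _, ch, _ in find_suspicious_dashes(command)}
    return "".join("-" if ch in bad else ch for ch in command)


# Hypothetical pasted command containing an en dash and an em dash.
pasted = "place_seqs.py \u2013s seqs.fasta -o out.tre \u2014p 1"
for pos, ch, name in find_suspicious_dashes(pasted):
    print(pos, repr(ch), name)
print(normalize_dashes(pasted))
```

Note that some editors turn a double hyphen into a single long dash, so a flag such as `--intermediate` would still need its second hyphen restored by hand after normalizing.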

Cheers,

Gavin

Lin1111111111 commented 2 years ago

Hi @gavinmdouglas, thank you for your advice. I tried but it still failed.

(picrust2) linziyin@linziyindeMacBook-Pro pic % place_seqs.py -s ../dan-sequence.fasta -o out.tre -p 1 --intermediate intermediate/place_seqs

Error running this command:
hmmalign --trim --dna --mapali /Users/linziyin/opt/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.fna.gz --informat FASTA -o intermediate/place_seqs/query_align.stockholm /Users/linziyin/opt/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.hmm ../dan-sequence.fasta

Standard error of the above failed command:

Error: Failed to open sequence file ../dan-sequence.fasta for reading

By the way, could this step be related to the problem?

"Place reads into reference tree

The first step of PICRUSt2 is to insert your study ASVs into a reference tree (details). By default, this reference tree is based on 20,000 16S sequences from genomes in the Integrated Microbial Genomes database. The place_seqs.py script performs this step, which specifically:

Aligns your study ASVs with a multiple-sequence alignment of reference 16S sequences with HMMER.
Find the most likely placements of your study ASVs in the reference tree with EPA-NG.
Output a treefile with the most likely placement for each ASV as the new tips with GAPPA."

Thank you sincerely, Vivi

Lin1111111111 commented 2 years ago

Hi @gavinmdouglas, I solved the first problem, but a new one came up when I tried the next step:

(picrust2) linziyin@linziyindeMacBook-Pro pic % metagenome_pipeline.py -i feature-table.biom -m marker_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz \
    -o EC_metagenome_out --strat_out

1 of 1119 ASVs were above the max NSTI cut-off of 2.0 and were removed.
Traceback (most recent call last):
  File "/Users/linziyin/opt/anaconda3/envs/picrust2/bin/metagenome_pipeline.py", line 122, in <module>
    main()
  File "/Users/linziyin/opt/anaconda3/envs/picrust2/bin/metagenome_pipeline.py", line 104, in main
    skip_norm=args.skip_norm)
  File "/Users/linziyin/opt/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/metagenome_pipeline.py", line 66, in run_metagenome_pipeline
    pred_marker)
  File "/Users/linziyin/opt/anaconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/util.py", line 372, in three_df_index_overlap_sort
    "input files.")
ValueError: No sequence ids overlap between all three of the input files.

Have you seen this situation before?

Thank you sincerely, Vivi

gavinmdouglas commented 2 years ago

Yes, that's a common error. You can see more details about it here: https://github.com/picrust/picrust2/wiki/Frequently-Asked-Questions#how-do-i-troubleshoot-valueerror-no-sequence-ids-overlap-between-all-three-of-the-input-files

Does the info there help?

I would guess that there are likely some formatting differences in the ASV names in your BIOM table compared to the FASTA you input. Is this the case?
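A quick way to check for that kind of mismatch is to pull the IDs out of each file and compare them. The sketch below is only an illustration using the Python standard library, not PICRUSt2's own code. It assumes the BIOM table has first been exported to a tab-separated text file, that IDs sit in the first column, and that lines starting with `#` are comments; the helper names (`fasta_ids`, `tsv_ids`, `report_overlap`) and the example IDs are hypothetical.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_ids(path):
    """Collect sequence IDs: header text after '>' up to the first space."""
    ids = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def tsv_ids(path, has_header=True):
    """Collect first-column IDs from a tab-separated table.

    Lines starting with '#' are skipped. Set has_header=False when the
    only header line is itself a '#' comment line.
    """
    ids = set()
    with _open_text(path) as handle:
        rows = (line.rstrip("\n").split("\t") for line in handle
                if line.strip() and not line.startswith("#"))
        if has_header:
            next(rows, None)  # drop the column-name row
        for row in rows:
            ids.add(row[0])
    return ids


def report_overlap(named_id_sets):
    """Print how many IDs are shared and show a few that are not."""
    shared = set.intersection(*named_id_sets.values())
    print(f"IDs shared by all inputs: {len(shared)}")
    for name, ids in named_id_sets.items():
        examples = sorted(ids - shared)[:3]
        print(f"{name}: {len(ids)} IDs, examples not shared: {examples}")
    return shared


# In-memory demo with made-up IDs that differ only in formatting.
report_overlap({
    "FASTA": {"ASV_1", "ASV_2"},
    "table": {"asv1", "asv2"},
})
```

With real files, the sets would come from calls such as `fasta_ids(...)` on the study FASTA and `tsv_ids(...)` on the predicted marker table and the exported feature table. If the shared count is zero, the unshared examples usually make the formatting difference obvious.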

Lin1111111111 commented 2 years ago

Hi, thank you for your response. I tried that approach, but I am not sure my file is in the right format.

JAkorli commented 2 years ago

Hello, I am hoping you can help me with this as well, because I have read through the discussion on this issue and it doesn't seem to be resolved. I have run into the same issue:

I run:

place_seqs.py -s picrust/SILVA-rep-seqs-97.fna -o out.tre -p 1 -t sepp --intermediate place_seqs

and get this back:

Error running this command:
hmmalign --trim --dna --mapali /Users/jakorli/opt/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.fna --informat FASTA -o place_seqs/query_align.stockholm /Users/jakorli/opt/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.hmm picrust/SILVA-rep-seqs-97.fna

Standard error of the above failed command:

Error: Failed to open sequence file picrust/SILVA-rep-seqs-97.fna for reading

Any help? I am using a MacPro 2020 with Quad-Core Intel Core i7 and 16GB RAM.

Thanks

gavinmdouglas commented 2 years ago

Hi @JAkorli,

That error means that the input file cannot be found - can you confirm that it is in that location?
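If it helps, here is a small sketch for checking how a relative path resolves from the current working directory before running the command. It is just an illustration with the Python standard library: the function name `describe_input_path` is made up, and the suggestion of similar file names is an extra I added to catch typos, not something PICRUSt2 does.

```python
import difflib
import os


def describe_input_path(path):
    """Report whether a path exists and, if not, which part is missing."""
    resolved = os.path.abspath(path)  # resolved against the current folder
    parent = os.path.dirname(resolved)
    if os.path.isfile(resolved):
        return f"OK: {resolved}"
    if not os.path.isdir(parent):
        return f"Missing folder: {parent}"
    # The folder exists, so look for similarly named files (possible typo).
    candidates = difflib.get_close_matches(
        os.path.basename(resolved), os.listdir(parent), n=3)
    return f"Missing file: {resolved}; similar names here: {candidates}"


# The path from the command above; the result depends on where this is run.
print(describe_input_path("picrust/SILVA-rep-seqs-97.fna"))
```

Running it from the same folder where `place_seqs.py` is called shows the absolute path that the relative one points to, which makes a wrong folder or a misspelled file name easier to spot.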

Cheers,

Gavin

JAkorli commented 2 years ago

@gavinmdouglas, I didn't realise that I was calling the file from a non-existent folder. Thanks for drawing my attention to it. The command is working now.

Cheers,

Picrust2 pipeline problem #215