Open lyisrae1 opened 1 month ago
It looks like the documentation needs to be fixed, but it is mostly an issue with file paths.
At the start it says to download metagenome.fna.gz
to $HOME/tutorial/test_data/
but later the file has a different name: $HOME/tutorial/test_data/78mbp_metagenome.fna
So to start, you should download metagenome.fna.gz
and save it as $HOME/tutorial/test_data/78mbp_metagenome.fna
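For anyone who has already downloaded the archive under its original name, here is a minimal Python sketch of one way to get the file name the later steps use. It assumes those steps want an uncompressed FASTA (the .fna name suggests so, but I have not confirmed it), and `gunzip_to` is a hypothetical helper name, not part of Autometa.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_to(src: Path, dest: Path) -> Path:
    """Decompress the gzip file `src` and write the result to `dest`.

    The source archive is left in place; `dest`'s parent directory is
    created if it does not exist yet.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


# Example call (assumes the archive was saved in test_data under its original name):
# data_dir = Path.home() / "tutorial" / "test_data"
# gunzip_to(data_dir / "metagenome.fna.gz", data_dir / "78mbp_metagenome.fna")
```

If the tutorial commands accept the gzipped file directly, renaming the download would be enough and this step is unnecessary.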
There is a separate issue in the ORF creation step.
Current:
autometa-orfs \
--assembly $HOME/tutorial/78mbp_metagenome.filtered.fna \
--output-nucls $HOME/tutorial/78mbp_metagenome.orfs.fna \
--output-prots $HOME/tutorial/a78mbp_metagenome.orfs.faa \
--cpus 40
Should be:
autometa-orfs \
--assembly $HOME/tutorial/78mbp_metagenome.filtered.fna \
--output-nucls $HOME/tutorial/78mbp_metagenome.orfs.fna \
--output-prots $HOME/tutorial/78mbp_metagenome.orfs.faa \
--cpus 40
That should fix the error.
CC @shaneroesemann @jason-c-kwan: the documentation needs to be updated accordingly. Also, the error message generated by autometa-markers is not helpful. The subprocess stderr should be captured and printed, rather than just reporting an error with hmmpress and "Make sure your hmm profiles are pressed!",
which wasn't the issue.
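To illustrate the stderr suggestion, here is a rough sketch using only the standard library. It is not Autometa's code: `run_and_report` is a hypothetical wrapper, and how it would fit into autometa's hmmscan module is an assumption on my part.

```python
import subprocess


def run_and_report(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run `cmd` and return the completed process.

    On a non-zero exit status, raise a RuntimeError whose message contains
    the external tool's own stderr, so the real cause is visible to the user.
    """
    try:
        # capture_output=True collects stdout and stderr; text=True decodes them.
        return subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        raise RuntimeError(
            f"{cmd[0]} exited with status {err.returncode}:\n{err.stderr.strip()}"
        ) from err
```

With something along these lines, whatever hmmscan itself printed would show up in the error, instead of only a generic hint about hmmpress.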
Tasks/Command(s)
Log/Error information generated by Autometa.
Hello, I appreciate you looking at my inquiry. I noticed that a step is missing from the tutorial on your ReadTheDocs page. No step is given to show us how to create the hmmscan.tsv files before we need them to complete Step 4 - Single Copy Markers. I followed the tutorial exactly, but I keep getting an error telling me that the hmmscan.tsv file does not exist. I will paste the directions for Step 4 here:
# Create a markers directory to hold the marker genes
mkdir -p $HOME/Autometa/autometa/databases/markers
# Change the default download path to the directory created above
autometa-config \
--section databases \
--option markers \
--value $HOME/Autometa/autometa/databases/markers
# Download single-copy marker genes
autometa-update-databases --update-markers
# hmmpress the marker genes
hmmpress -f $HOME/Autometa/autometa/databases/markers/bacteria.single_copy.hmm
hmmpress -f $HOME/Autometa/autometa/databases/markers/archaea.single_copy.hmm
autometa-markers \
--orfs $HOME/tutorial/78mbp_metagenome.orfs.faa \
--kingdom bacteria \
--hmmscan $HOME/tutorial/78mbp_metagenome.hmmscan.tsv \
--out $HOME/tutorial/78mbp_metagenome.markers.tsv \
--parallel \
--cpus 4 \
--seed 42
When I follow this code, I get this error:
ERROR:
[10/23/2024 04:39:10 PM DEBUG] autometa.common.external.hmmscan: hmmscan --seed 42 --cpu 0 --tblout /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.hmmscan.tsv /vast/agnanad1/Leone/autometa_tutorial/markers/bacteria.single_copy.hmm /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.orfs.faa
[10/23/2024 04:39:10 PM WARNING] autometa.common.external.hmmscan: Make sure your hmm profiles are pressed! hmmpress -f /vast/agnanad1/Leone/autometa_tutorial/markers/bacteria.single_copy.hmm
Traceback (most recent call last):
File "/home/lyisrae1/.conda/envs/autometa/bin/autometa-markers", line 10, in
sys.exit(main())
^^^^^^
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/site-packages/autometa/common/markers.py", line 266, in main
get(
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/site-packages/autometa/common/markers.py", line 162, in get
scans = hmmscan.run(
^^^^^^^^^^^^
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/site-packages/autometa/common/external/hmmscan.py", line 174, in run
annotate_sequential(
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/site-packages/autometa/common/external/hmmscan.py", line 106, in annotate_sequential
raise err
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/site-packages/autometa/common/external/hmmscan.py", line 101, in annotate_sequential
subprocess.run(
File "/home/lyisrae1/.conda/envs/autometa/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['hmmscan', '--seed', '42', '--cpu', '0', '--tblout', '/vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.hmmscan.tsv',
Additionally, I've had a few syntax issues in Step 5 - Taxonomy, but those were very easy to fix, so that is not the issue. Could I please get some clarification on how to finish Step 4 on ReadTheDocs? I cannot finish the tutorial properly without that step.
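Since the warning in the log points at hmmpress, one thing that can be checked independently is whether the pressed index files sit next to the .hmm file that hmmscan was actually given. A small sketch, not taken from Autometa: `missing_hmmpress_files` is a hypothetical helper, and it relies on hmmpress writing `.h3f`, `.h3i`, `.h3m` and `.h3p` files alongside the profile.

```python
from pathlib import Path

# File suffixes that hmmpress appends to the profile's file name.
HMMPRESS_SUFFIXES = (".h3f", ".h3i", ".h3m", ".h3p")


def missing_hmmpress_files(hmm: Path) -> list[Path]:
    """Return the hmmpress index files that do not exist next to `hmm`.

    An empty list means all four index files are present.
    """
    expected = [hmm.with_name(hmm.name + suffix) for suffix in HMMPRESS_SUFFIXES]
    return [path for path in expected if not path.is_file()]


# Example call, using the profile path from the DEBUG line in the log:
# missing_hmmpress_files(
#     Path("/vast/agnanad1/Leone/autometa_tutorial/markers/bacteria.single_copy.hmm")
# )
```

An empty result would suggest the profiles are pressed and the failure comes from something else, which is where seeing hmmscan's own stderr would help.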
[autometa_tutorial.txt](https://github.com/user-attachments/files/17498392/autometa_tutorial.txt)
Here is the command I ran without the markers:
autometa-binning \
--kmers /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.bacteria.kmers.embedded.tsv \
--coverages /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.coverages.tsv \
--gc-content /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.gc_content.tsv \
--output-binning /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.binning.tsv \
--output-main /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.main.tsv \
--clustering-method dbscan \
--completeness 20 \
--purity 90 \
--cov-stddev-limit 25 \
--gc-stddev-limit 5 \
--taxonomy /vast/agnanad1/Leone/autometa_tutorial/78mbp_metagenome.taxonomy.tsv \
--starting-rank superkingdom \
--rank-filter superkingdom \
--rank-name-filter bacteria
And here is the error message:
usage: autometa-binning [-h] --kmers filepath --coverages filepath
--gc-content filepath --markers filepath
--output-binning filepath [--output-main filepath]
[--clustering-method {dbscan,hdbscan}]
[--completeness 0 < float <= 100]
[--purity 0 < float <= 100] [--cov-stddev-limit float]
[--gc-stddev-limit float] [--taxonomy filepath]
[--starting-rank {superkingdom,phylum,class,order,family,genus,species}]
[--reverse-ranks]
[--rank-filter {superkingdom,phylum,class,order,family,genus,species}]
[--rank-name-filter RANK_NAME_FILTER] [--verbose]
[--cpus int]
autometa-binning: error: the following arguments are required: --markers
https://autometa.readthedocs.io/en/latest/bash-step-by-step-tutorial.html#single-copy-markers
Thank you for your time,
Leone