MaestSi / MetONTIIME

A Meta-barcoding pipeline for analysing ONT data in QIIME2 framework
GNU General Public License v3.0

Path and manifest file error? #50

Closed JamesN241 closed 2 years ago

JamesN241 commented 2 years ago

Hi,

I am having trouble running the pipeline. I have a fastq file that has already been basecalled, adaptors removed, filtered and ready to go.

I am running the pipeline as follows:

nohup ./MetONTIIME.sh -w /home/jamesnolan24/MetONTIIME/Site_D1/JN_SiteD1.fastq.gz -f /home/jamesnolan241/MetONTIIME/Site_D1/manifest.csv -s /home/jamesnolan241/MetONTIIME/seqdegapped.qza -t /home/jamesnolan241/MetONTIIME/taxonomy.qza -n 6 -c Blast -m >10 -q 0.9 -i 0.9 &

I keep getting this error among others:

realpath: /home/jamesnolan24/MetONTIIME/Site_D1/JN_SiteD1.fastq.gz: No such file or directory
realpath: missing operand
Try 'realpath --help' for more information.
./MetONTIIME.sh: line 86: /manifest.txt: Permission denied
Usage: qiime tools import [OPTIONS]

I have also tried running the pipeline with the uncompressed fastq file instead of the .gz file, but no luck. Sorry if this is an obvious issue, I'm relatively new to the world of bioinformatics!

Thanks!

MaestSi commented 2 years ago

Ciao, for the -w input parameter, you should not provide a single fastq.gz file, but the directory containing one or more fastq.gz files. Also, keep in mind that the manifest and the metadata files are two different files: the former tells the pipeline where the files corresponding to each sample are, while the latter provides sample metadata. Please delete the metadata and manifest files created by previous runs, and try running:

nohup ./MetONTIIME.sh -w /home/jamesnolan24/MetONTIIME/Site_D1/ -f /home/jamesnolan241/MetONTIIME/Site_D1/sample-metadata.tsv -s /home/jamesnolan241/MetONTIIME/seqdegapped.qza -t /home/jamesnolan241/MetONTIIME/taxonomy.qza -n 6 -c Blast -m 10 -q 0.9 -i 0.9 &

P.S.: '> 10' is not a valid value for the -m parameter; try 10 instead.

Simone

JamesN241 commented 2 years ago

Hi Simone,

Thanks very much for the response. That has given me a nohup.out file which is great. How do I proceed with this OUT file? Also just to clarify do I need to provide a manifest file or is that created also? Sorry again for the spam.

Thanks, James.

MaestSi commented 2 years ago

Now the pipeline is running in the background, and standard error and standard output are written to the nohup.out file, so you can monitor the pipeline's execution there. The manifest file is created by the pipeline; you do not have to provide one. The sample-metadata file, on the other hand, can be created by you if you have many samples and interesting metadata that you would like to explore in statistical analyses. If this is not the case, the pipeline creates the metadata file with minimal information. Best, Simone
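Since nohup.out collects both standard output and standard error, one way to check a run is to search that file for lines that look like failures. The following is only a minimal sketch: scan_log is a hypothetical helper (not part of MetONTIIME), and the search patterns are assumptions based on the messages quoted in this thread.

```shell
# Hypothetical helper, not part of MetONTIIME: print the numbered lines of a
# log file that look like failures. The patterns are assumptions taken from
# the messages seen in this thread.
scan_log() {
    log="${1:-nohup.out}"   # default to nohup.out in the current directory
    grep -n -i -E 'error|no such file|permission denied|missing operand' "$log"
}

# To follow the log live while the pipeline runs, one could instead use:
#   tail -f nohup.out
```

Running scan_log with no arguments in the directory where the pipeline was launched lists candidate problem lines. A match is only a hint and is not necessarily a real error.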

JamesN241 commented 2 years ago

Thanks again Simone, great! There still seems to be an issue, however: I just checked the nohup.out file and the same error is there (nohup.out.txt).

MaestSi commented 2 years ago

Based on this error:

realpath: /home/jamesnolan24/MetONTIIME/Site_D1/: No such file or directory
realpath: missing operand

it looks like the /home/jamesnolan24/MetONTIIME/Site_D1/ directory does not exist. Can you check that it exists and that it contains only fastq.gz files? Simone
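A check like this can be scripted before launching the pipeline. The sketch below is an illustration only: check_fastq_dir is a hypothetical helper (not part of MetONTIIME), it looks only at regular files directly inside the directory, and it assumes a find that supports -maxdepth (as on Ubuntu).

```shell
# Hypothetical helper, not part of MetONTIIME: verify that a directory exists
# and that the regular files directly inside it are all fastq.gz files.
check_fastq_dir() {
    dir="$1"
    # Fail early if the directory is missing (for example, a typo in the path).
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Collect regular files whose names do not end in .fastq.gz.
    extra=$(find "$dir" -maxdepth 1 -type f ! -name '*.fastq.gz')
    if [ -n "$extra" ]; then
        echo "unexpected files:"
        echo "$extra"
        return 2
    fi
    # Require at least one fastq.gz file.
    if [ -z "$(find "$dir" -maxdepth 1 -type f -name '*.fastq.gz')" ]; then
        echo "no fastq.gz files in: $dir"
        return 3
    fi
    echo "ok: $dir"
}
```

For example, check_fastq_dir /home/jamesnolan24/MetONTIIME/Site_D1/ would report the directory as missing if the path is misspelled.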

JamesN241 commented 2 years ago

Hi Simone, I believe it worked: I was missing the 1 after jamesnolan24. Sorry! I now have 3 qza files: rep-seqs_tmp.qza, sequences.qza and table_tmp.qza. So I think all is well, thanks so much!

One more question, if that's ok. After I run that script, two numbers are printed: [3] 13243. What are these? Also, although it seems to have finished running, the cursor keeps flashing and does not return to a new line; I have to press Enter again for it to return to the start of the terminal line, if that makes sense.

(MetONTIIME_env) jamesnolan241@DESKTOP-688RAN2:~/MetONTIIME$ nohup ./MetONTIIME.sh -w /home/jamesnolan241/MetONTIIME/Site_D1/ -f /home/jamesnolan241/MetONTIIME/Site_D1/sample-metadata.tsv -s /home/jamesnolan241/MetONTIIME/seqdegapped.qza -t /home/jamesnolan241/MetONTIIME/taxonomy.qza -n 8 -c Vsearch -m 3 -q 0.9 -i 0.9 &
[3] 13243
(MetONTIIME_env) jamesnolan241@DESKTOP-688RAN2:~/MetONTIIME$ nohup: ignoring input and appending output to 'nohup.out'

MaestSi commented 2 years ago

I think the pipeline is still running, so let it run until you see that the /home/jamesnolan241/MetONTIIME/Site_D1/collapsed_feature_tables directory has been created. Apart from reading the content of the nohup.out file, you can monitor the processes running on your computer with the top -u <username> or htop -u <username> commands. The number [3] is the shell's job number for this background command (not a count of running processes), while 13243 should be the main process ID associated with this run of the pipeline. After you run the command, the pipeline runs in the background, so you can press Enter and continue working in that terminal. If you prefer not to run the pipeline in the background, just run:

./MetONTIIME.sh -w /home/jamesnolan241/MetONTIIME/Site_D1/ -f /home/jamesnolan241/MetONTIIME/Site_D1/sample-metadata.tsv -s /home/jamesnolan241/MetONTIIME/seqdegapped.qza -t /home/jamesnolan241/MetONTIIME/taxonomy.qza -n 6 -c Blast -m 10 -q 0.9 -i 0.9

but be sure not to close the terminal until it completes, and do not run multiple instances of the pipeline in parallel. SM
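The process ID reported when a background command starts can also be used to test whether that command is still alive. The sketch below uses sleep as a stand-in for the pipeline (an assumption made only to keep the example short and self-contained) and relies on the standard shell variable $! and on kill -0.

```shell
# Start a stand-in background job (sleep replaces the real pipeline here).
sleep 30 &
pid=$!   # $! holds the process ID of the most recent background command

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    state="running"
else
    state="finished"
fi
echo "PID $pid is $state"

# Clean up the stand-in job so the example leaves nothing behind.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

With the real pipeline, the same kill -0 check could be run against the process ID printed in the terminal, without the clean-up step.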

JamesN241 commented 2 years ago

Fantastic thanks so much for your help. That's worked now and I have all the associated files, I really appreciate the help.

MaestSi commented 2 years ago

Glad it worked! Best, Simone

JamesN241 commented 2 years ago

Hi Simone,

Sorry to bother you again, but I'm having an issue trying to run the Evaluate_diversity.sh script or the Evaluate_diversity_non_phylogenetic.sh script.

My line of code is:

(MetONTIIME_env) jamesnolan241@DESKTOP-688RAN2:~/MetONTIIME$ nohup ./Evaluate_diversity.sh -w /home/jamesnolan241/MetONTIIME/Site_D1 -d 50000 -m /home/jamesnolan241/MetONTIIME/Site_D1/sample-metadata.tsv -t 8 -c 1

The issue I'm getting is it doesn't seem to recognise the script file:

Working directory: /home/jamesnolan241/MetONTIIME/Site_D1
Sampling depth: 50000 reads
Sample metadata: /home/jamesnolan241/MetONTIIME/Site_D1/sample-metadata.tsv
Number of threads: 8
Clustering threshold: 1
./Evaluate_diversity.sh: line 75: activate: No such file or directory
Imported /home/jamesnolan241/MetONTIIME/Site_D1/manifest_50000_subsampled.txt as SingleEndFastqManifestPhred33V2 to /home/jamesnolan241/MetONTIIME/Site_D1/sequences_50000_subsampled.qza
Saved FeatureTable[Frequency] to: /home/jamesnolan241/MetONTIIME/Site_D1/table_tmp_50000_subsampled.qza
Saved FeatureData[Sequence] to: /home/jamesnolan241/MetONTIIME/Site_D1/rep-seqs_tmp_50000_subsampled.qza
Saved FeatureTable[Frequency] to: /home/jamesnolan241/MetONTIIME/Site_D1/table_50000_subsampled.qza
Saved FeatureData[Sequence] to: /home/jamesnolan241/MetONTIIME/Site_D1/rep-seqs_50000_subsampled.qza
Plugin error from phylogeny:

Command '['mafft', '--preservecase', '--inputorder', '--thread', '8', '/tmp/qiime2-archive-yrhmve/900ab972-a522-411d-bb5f-d7e55e196fe0/data/dna-sequences.fasta']' returned non-zero exit status 1.

Debug info has been saved to /tmp/qiime2-q2cli-err-9rsotyw0.log
Usage: qiime diversity core-metrics-phylogenetic [OPTIONS]

Sorry again, and thanks for the help. James.

MaestSi commented 2 years ago

Ciao, I'd say there are two possibilities. The first is that you don't have any samples with at least 50k reads. Can you try the command with a lower number of reads (say 1000)? The second is that you are running out of RAM. How much RAM do you have available on your system?

P.S.: ./Evaluate_diversity.sh: line 75: activate: No such file or directory is not an error. It is a warning message caused by the fact that you had already activated the MetONTIIME conda environment. Best, Simone
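To see whether a sample has enough reads for a given sampling depth, the reads in each fastq.gz file can be counted. This is a minimal sketch: count_reads is a hypothetical helper (not part of MetONTIIME), and it assumes each FASTQ record spans exactly four lines, which is typical but not guaranteed.

```shell
# Hypothetical helper, not part of MetONTIIME: count reads in a gzipped FASTQ.
# Assumes four lines per record (header, sequence, '+', qualities).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Possible usage over a directory of samples (the path is a placeholder):
#   for f in /path/to/samples/*.fastq.gz; do
#       echo "$f: $(count_reads "$f") reads"
#   done
```

A sample whose count is below the value passed to -d cannot be subsampled to that depth.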

JamesN241 commented 2 years ago

Thanks Simone,

I have 16 GB of RAM, with about 9 GB available under Ubuntu. That file has 53000 reads, but I ran it with 1000 reads, which gave me the associated files, so perhaps it's the size.

Ah I see, so it's not an error. So should I not have activated the MetONTIIME environment? Do I need to activate a different environment?

Thanks again, James.

MaestSi commented 2 years ago

You don't really need to activate the MetONTIIME_env environment beforehand, but if you do, that is not an issue...you just get a warning telling you that it can't activate the environment (because it is already active). Regarding visualization of trees, I think QIIME2 does not have any built-in tree visualizer. However, I think you should try doing something like the following, starting from the output of Evaluate_diversity.sh with -c 1 parameter.

mkdir exported-tree
qiime tools export --input-path rooted-tree_"$SAMPLING_DEPTH"_subsampled.qza --output-path exported-tree
cp taxonomy.qza exported-tree

Upload the exported-tree/tree.nwk file to the iTOL website using the <Choose file> selector. Then go to Control panel -> Datasets -> Upload annotation files and select the taxonomy.qza file. If you used a different value for the -c parameter to perform read clustering, I think you should also reassign a taxonomy to each representative sequence of the OTUs, with a command similar to this:

qiime feature-classifier classify-consensus-vsearch \
    --i-query rep-seqs_"$SAMPLING_DEPTH"_subsampled.qza \
    --i-reference-reads $DB \
    --i-reference-taxonomy $TAXONOMY \
    --p-perc-identity $ID_THR \
    --p-query-cov $QUERY_COV \
    --p-maxaccepts 100 \
    --p-maxrejects 100 \
    --p-maxhits $MAX_ACCEPTS \
    --p-strand 'both' \
    --p-unassignable-label 'Unassigned' \
    --p-threads $THREADS \
    --o-classification taxonomy_clustered_reads.qza

and upload taxonomy_clustered_reads.qza as an annotation file. Hope this helps, Simone

JamesN241 commented 2 years ago

Hi Simone,

That's worked perfectly, I really appreciate all the help! That's helped a lot with the reads now, very easy to visualise it now.

Thanks so much, James.