ericcapo / marky-coco

Ready-to-use pipeline to detect, count and identify the hgcAB genes, involved in mercury methylation, from metagenomes

Error while running script #4

Closed joafarcos closed 1 year ago

joafarcos commented 1 year ago

Hello,

I am trying to run your script to detect hgcAB genes in some metagenomic samples, but I am encountering an error when running it on your test sample. This is the error message:

    ==========     _____ _    _ ____  _____  ______          _____  
    =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
      =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
        ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
          ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
    ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
      v2.0.3

    //========================== featureCounts setting ===========================\
    Input files : 1 BAM file
    MG01.bam
    Output file : MG01_counts.tsv
    Summary : MG01_counts.tsv.summary
    Paired-end : no
    Count read pairs : no
    Annotation : MG01_genes.gff (GTF)
    Dir for temp files : MG01_tmp
    Threads : 1
    Level : meta-feature level
    Multimapping reads : not counted
    Multi-overlapping reads : not counted
    Min overlapping bases : 1

    \============================================================================//

    //================================= Running ==================================\
    Load annotation file MG01_genes.gff ...
    Features : 386
    Meta-features : 386
    Chromosomes/contigs : 247
    Process BAM file MG01.bam...

    ERROR: Paired-end reads were detected in single-end read library : MG01_tmp/MG01.bam
    [Fri Jan 6 10:49:12 2023]
    Error in rule featureCounts:
        jobid: 1
        input: MG01_tmp/MG01_genes.gff, MG01_tmp/MG01.bam
        output: MG01_tmp/MG01_counts.tsv
        shell:
            featureCounts -t CDS -o MG01_tmp/MG01_counts.tsv -g ID -a MG01_tmp/MG01_genes.gff MG01_tmp/MG01.bam
            (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: .snakemake/log/2023-01-06T104845.028411.snakemake.log

Do you have any idea what the problem is? Thank you for your help.

best regards, Joana C.

ericcapo commented 1 year ago

Hi Joana, it is a bit unclear to me what is happening. Did you start with paired-end fastq files for MG01? Can you please take a screenshot of your inputs and outputs folders so that I can maybe see where the error is? Thank you for using marky-coco :)

Cheers Eric

joafarcos commented 1 year ago

Hi Eric,

Thanks for your quick reply! I must warn you that I am an inexperienced user, but I am really interested in using your pipeline, and thank you for creating it!

Regarding the issue, I just ran your script to test whether the program was OK after the installation:

    wget https://figshare.com/ndownloader/articles/19221213/versions/1
    unzip 1
    bash marky.sh MG01

and after a bit it gave me the error I showed you before.

I am now going to repeat the installation, because there was a power cut while I was installing it. Although no error message was shown at the end, I thought maybe something went wrong and that was the source of the problem (maybe!).

Do you have any thoughts on this?

thank you so much. cheers, Joana.

joafarcos commented 1 year ago

Hello Eric,

So I reinstalled the program and tried again to run the script on the test samples, but I had the same error. I also tried to run the script with my own samples, but I get this error message:

    (coco) ci@connect2oceans-fe:~/marky-coco$ bash marky.sh PE22_3
    Building DAG of jobs...
    MissingInputException in rule fastp in file /home/ci/marky-coco/workflow/Snakefile, line 1:
    Missing input files for rule fastp:
        output: PE22_3_tmp/PE22_3_P1.fastq, PE22_3_tmp/PE22_3_P2.fastq, PE22_3_outputs/PE22_3_fastp.html, PE22_3_outputs/PE22_3_fastp.json
        wildcards: sample=PE22_3
        affected files:
            PE22_3_1.fastq
            PE22_3_2.fastq
    Building DAG of jobs...
    MissingInputException in rule bt2build in file /home/ci/marky-coco/workflow/Snakefile, line 20:
    Missing input files for rule bt2build:
        output: PE22_3_tmp/PE22_3.index.1.bt2, PE22_3_tmp/PE22_3.index.2.bt2, PE22_3_tmp/PE22_3.index.3.bt2, PE22_3_tmp/PE22_3.index.4.bt2, PE22_3_tmp/PE22_3.index.rev.1.bt2, PE22_3_tmp/PE22_3.index.rev.2.bt2
        wildcards: sample=PE22_3
        affected files:
            PE22_3_tmp/PE22_3_megahit/final.contigs.fa
    Building DAG of jobs...
    MissingInputException in rule prodigal in file /home/ci/marky-coco/workflow/Snakefile, line 42:
    Missing input files for rule prodigal:
        output: PE22_3_tmp/PE22_3_genes.gff, PE22_3_tmp/PE22_3_proteins.faa
        wildcards: sample=PE22_3
        affected files:
            PE22_3_tmp/PE22_3_megahit/final.contigs.fa

My sample is PE22_3.fq.

Can you help me figure out what the problem might be? I imagine something is not right with the sample file.

thanks again. best regards.

Joana.

ericcapo commented 1 year ago

Hi Joana,

The test samples run fine for me, so it may be because of the versions of the software installed.

About your metagenome PE22_3: it looks like this is a single-end metagenome, because your input file is PE22_3.fq. First, change the extension so that the file is named PE22_3.fastq.

I have now added a script for single-end metagenomes and added a test metagenome too. I would advise you to delete the marky-coco folder from your computer/server and re-install it (no need to reinstall the conda environment coco, which did not change). Please use both test samples (MG01 and MG02) and tell me if you still have errors. I hope this works with your metagenome now.

Hope this helps. Thanks for your feedback, it gave me the motivation to finally write the script for single-end metagenomes ;)

Best, Eric
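
For readers who hit the same MissingInputException: the file names in the log suggest that the paired-end workflow looks for `<sample>_1.fastq` and `<sample>_2.fastq` in the working directory, and the advice above suggests `<sample>.fastq` for single-end data. A minimal pre-flight check could look like the sketch below, assuming those naming conventions hold. The function name `check_inputs` is hypothetical and is not part of marky-coco.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of marky-coco: report which input layout
# (paired-end or single-end) is present for a sample before launching a run.
# Assumed naming, inferred from the thread: <sample>_1.fastq + <sample>_2.fastq
# for paired-end data, <sample>.fastq for single-end data.
check_inputs() {
    local sample="$1"
    local dir="${2:-.}"   # directory holding the FASTQ files (default: current)

    if [[ -f "$dir/${sample}_1.fastq" && -f "$dir/${sample}_2.fastq" ]]; then
        echo "paired"
    elif [[ -f "$dir/${sample}.fastq" ]]; then
        echo "single"
    else
        # Nothing matched, e.g. the file is still named <sample>.fq
        echo "missing"
        return 1
    fi
}
```

Running `check_inputs PE22_3` in a folder that only contains `PE22_3.fq` would print `missing`, which matches the situation described above.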

joafarcos commented 1 year ago

Thank you Eric,

How can I install the program without installing the environment?

Thanks.


ericcapo commented 1 year ago

You just run

    git clone https://github.com/ericcapo/marky-coco.git
    cd marky-coco

The coco environment should be installed somewhere else on your computer/server, so if you run

    source conda_init.sh
    conda activate coco

you should see that it still activates even after deleting the folder.

joafarcos commented 1 year ago

Hi Eric,

I just wanted to give you an update on how things are going. As you suggested, I re-installed the program and changed the file extension of my sample to .fastq, and the script for paired-end sequences is now running; I believe it is all OK. (My sequences are paired-end, but I am glad that I gave you motivation to improve the script :) )

I have not tried the test sequence, but if it worked for you, it should be OK :)

I also wanted to ask you (because I can see you are an expert in finding these methylation genes) if you can give me some insight on how to use this tool to detect hgcA and hgcB genes in MAGs that were retrieved from my metagenomic samples. Do I just need to run HMMER against your Hg-MATE database? I would be very grateful for your opinion on this.

Thank you again for your help and for the development of this fantastic tool! ..and fingers crossed for my sample:)

best regards, Joana.

ericcapo commented 1 year ago

Hi Joana, to look for hgcA genes in MAGs, you can run prodigal on each of your MAGs:

    prodigal -i bin001.fa -o bin001_genes.gff -f gff -a bin001_proteins.faa

and then screen bin001_proteins.faa for hgcA genes using hmmer:

    hmmsearch -o bin001_hgcA.txt --tblout bin001_hgcA_hmmer.out db/Hg-MATE-Db.v1/Hg-MATE-Db.v1.01142021_ISOCELMAG_HgcA.hmm bin001_proteins.faa

Hope this helps! Best, Eric
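
With many MAGs, the two commands in this reply can be wrapped in small functions. The sketch below is only an illustration under stated assumptions: each MAG is a FASTA file named `<bin>.fa`, the HMM profile sits at the path quoted in the reply, and the function names `screen_mag` and `count_hits` are hypothetical. The hit count relies on the `--tblout` format, in which comment lines start with `#` and every other line describes one hit.

```bash
#!/usr/bin/env bash
# Sketch only: wraps the prodigal and hmmsearch commands from the reply above.
# Assumptions: each MAG is a FASTA file named <bin>.fa, and the HMM profile
# sits at the path quoted in the thread (adjust it to your installation).
HMM="db/Hg-MATE-Db.v1/Hg-MATE-Db.v1.01142021_ISOCELMAG_HgcA.hmm"

# Predict genes for one MAG, then screen the predicted proteins for hgcA.
screen_mag() {
    local fasta="$1"
    local bin="${fasta%.fa}"     # bin001.fa -> bin001
    prodigal -i "$fasta" -o "${bin}_genes.gff" -f gff -a "${bin}_proteins.faa"
    hmmsearch -o "${bin}_hgcA.txt" --tblout "${bin}_hgcA_hmmer.out" \
        "$HMM" "${bin}_proteins.faa"
}

# Count hits in an hmmsearch --tblout file: lines starting with '#' are
# comments, every other non-empty line is one target sequence hit.
count_hits() {
    grep -v '^#' "$1" | grep -c . || true
}
```

A possible use is `for f in bins/*.fa; do screen_mag "$f"; echo "$f: $(count_hits "${f%.fa}_hgcA_hmmer.out") hit(s)"; done`, where the `bins/` folder is an assumption. A raw count is only a first pass; the hits still need to be inspected, for example by looking at the E-values reported in the tblout file.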

joafarcos commented 1 year ago

Hi Eric,

I had the same error with my metagenomic sample as I had with the test sample (I think). My output folder contains only 3 files: PE22_1_bowtie2.log, PE22_1_fastp.html and PE22_1_fastp.json.

[Screenshot, 2023-01-11 at 14:03:36]

This is the error I got, while running Subread:

    //================================= Running ==================================\
    Load annotation file PE22_1_genes.gff ...
    Features : 4248798
    Meta-features : 4248798
    Chromosomes/contigs : 3029779
    Process BAM file PE22_1.bam...

    ERROR: Paired-end reads were detected in single-end read library : PE22_1_tmp/PE22_1.bam
    [Wed Jan 11 05:42:07 2023]
    Error in rule featureCounts:
        jobid: 1
        input: PE22_1_tmp/PE22_1_genes.gff, PE22_1_tmp/PE22_1.bam
        output: PE22_1_tmp/PE22_1_counts.tsv
        shell:
            featureCounts -t CDS -o PE22_1_tmp/PE22_1_counts.tsv -g ID -a PE22_1_tmp/PE22_1_genes.gff PE22_1_tmp/PE22_1.bam
            (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: .snakemake/log/2023-01-11T032054.670702.snakemake.log

[Screenshot, 2023-01-11 at 14:06:03]

Do you have an idea what might be the problem?

thanks for your help! best, Joana

ericcapo commented 1 year ago

Hi Joana, the error is that "Paired-end reads were detected in single-end read library: PE22_1_tmp/PE22_1.bam", but I have no clue why. Try the test files and see if they work. If not, the problem may come from how the software was installed in the conda environment. If you want me to try an analysis on your metagenome, send links to eric.capo@hotmail.fr

joafarcos commented 1 year ago

Hi Eric,

I am trying to run the script on the test files, but the script for paired-end data is not running OK.

[Screenshot, 2023-01-12 at 11:24:24]

As you can see, the download is OK, but when I try to run the script I get a "command not found" message:

[Screenshot, 2023-01-12 at 11:17:44] [Screenshot, 2023-01-12 at 11:19:10]

Can you help?

Thanks :) Joana.

ericcapo commented 1 year ago

It probably doesn't work because you did not activate the conda environment:

    conda activate coco

Eric
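
A "command not found" message of this kind can be caught before the pipeline starts. The sketch below is one hedged way to do it: it relies on conda exporting the `CONDA_DEFAULT_ENV` variable when an environment is activated, and on the shell builtin `command -v` to look up tools on `PATH`. The function names `require_env` and `require_tools` are hypothetical and not part of marky-coco.

```bash
#!/usr/bin/env bash
# Hypothetical guard, not part of marky-coco: stop early if the expected conda
# environment is not active. conda exports CONDA_DEFAULT_ENV on activation.
require_env() {
    local wanted="$1"
    if [[ "${CONDA_DEFAULT_ENV:-}" != "$wanted" ]]; then
        echo "conda environment '$wanted' is not active; run: conda activate $wanted" >&2
        return 1
    fi
}

# Hypothetical guard: confirm that each named tool can be found on PATH.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "command not found: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `require_env coco && require_tools snakemake featureCounts && bash marky.sh MG01` would only launch the run when both checks pass. The tool names passed here are the ones that appear in the logs of this thread.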

joafarcos commented 1 year ago

Hi Eric,

Sorry, you were right, I forgot to activate the environment. But I tried again after activating the environment and I got the same message:

[Screenshot, 2023-01-12 at 12:23:24]

ericcapo commented 1 year ago

Hi Joana, the error is still the same, so I guess this is because of the version of subread (which includes featureCounts) that is installed on your computer/server. I would try with version subread/1.5.2.

Alternatively, I added a step-by-step tutorial for hgcA in marky-coco, so maybe you can give it a try and see where there is an issue in your code. Instead of activating the coco environment, try installing and activating each piece of software individually. That could work. Hope this helps.

Best, Eric
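
Some background on why the subread version may matter: the first log in this thread shows featureCounts v2.0.3 reporting "Paired-end : no" while the BAM file contains paired reads. To my understanding of the Subread release notes, from version 2.0.2 onward featureCounts expects paired-end input to be declared with `-p` (with `--countReadPairs` added when fragments rather than individual reads should be counted), whereas older releases accepted a paired-end BAM without it. The sketch below picks the extra flag from a version string. It is not the fix applied in marky-coco, the function names are hypothetical, and `sort -V` assumes GNU coreutils or a compatible `sort`.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of marky-coco. Assumption (from the Subread
# release notes as I understand them): starting with Subread 2.0.2,
# featureCounts needs -p to accept a paired-end BAM; older versions do not.

# Return success if version $1 is greater than or equal to version $2.
version_ge() {
    [[ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" == "$2" ]]
}

# Print the extra flag needed so that a paired-end BAM is accepted while reads
# are still counted individually (no --countReadPairs).
fc_paired_flag() {
    local version="${1#v}"       # accept "v2.0.3" or "2.0.3"
    if version_ge "$version" "2.0.2"; then
        echo "-p"
    else
        echo ""
    fi
}
```

Under that assumption, a manual run on the test sample with subread 2.0.3 would become `featureCounts $(fc_paired_flag 2.0.3) -t CDS -o MG01_tmp/MG01_counts.tsv -g ID -a MG01_tmp/MG01_genes.gff MG01_tmp/MG01.bam`, while with an older version such as 1.6.3 the command stays exactly as it appears in the log.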

joafarcos commented 1 year ago

Hello Eric,

Thanks for taking the time to help me solve this issue.

The problem now is that I am not sure how to run the new step-by-step tutorial you provided. I tried to copy and paste the link into the navigation bar, but it does not open anything. How should I open the HTML file you provided?

Thanks again, Best regards,

Joana Costa Investigadora | Researcher

CIIMAR | Interdisciplinary Centre of Marine and Environmental Research of the University of Porto Terminal de Cruzeiros do Porto de Leixões Avenida General Norton de Matos, S/N 4450-208 Matosinhos | Portugal E-mail: @.*** Tel. (+351) 223 401 811




ericcapo commented 1 year ago

Hi Joana,

You need to download the file. If you do not know how to get it, re-download the whole marky-coco repository and save the .html file on your computer. Then, if you double-click on it, it will open in Firefox or Google Chrome. Hope this helps. Cheers, Eric

PS: otherwise, send an email to eric.capo@hotmail.fr and I will send you the file.

ericcapo commented 1 year ago

Message from Joana: "Thanks for all your help. The pipeline is now working OK with test sample MG01, so it will probably work fine with my metagenomes too!

I did what you suggested and installed a different version of subread (1.6.3); mine was 2.0.3."

I have now modified the conda environment so that it should install the exact version of each piece of software.