dpuiu / MitoHPC


Issues with chrMR.fa #5

Open munishika opened 1 year ago

munishika commented 1 year ago

Hi,

I am using MitoHPC, but I am getting the following error:

A USER ERROR has occurred: Fasta index file file:///scratch/prj/mito_als/MitoHPC/RefSeq/chrMR.fa.fai for reference file:///scratch/prj/mito_als/MitoHPC/RefSeq/chrMR.fa does not exist. Please see http://gatkforums.broadinstitute.org/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference for help creating it.

I have tried to build the .fai file, but it doesn't work. Also, the chrMR.fa file is empty.

Can you please help me figure out the issue?

Thanks in advance, Munishika

dpuiu commented 1 year ago

Hi Munishika, sorry about that.

I have updated "scripts/circFasta.sh", the script which generates the "RefSeq/chrMR.fa" FASTA reference, along with the FASTA index file and now the GATK sequence dictionary.

You have to rerun "cd scripts/; . ./init.sh"

However, "RefSeq/chrMR.fa" should not be empty.

Are you using the default HP_RNAME=hs38DH, HP_RMT=chrM, HP_MT=chrM variables?

munishika commented 1 year ago

Hi Daniela,

Thanks for your reply.

I did rerun "cd scripts/; . ./init.sh"

I am using the default variables.

The chrMR.fa file is definitely empty. It's also not present in the GitHub RefSeq directory (https://github.com/dpuiu/MitoHPC/tree/main/RefSeq), whilst all the other files are.

Any ideas on how I can resolve this?

dpuiu commented 1 year ago

Sorry, I forgot ... you also have to re-run "./install_prerequisites.sh"

"install_prerequisites.sh" calls "circFasta.sh" and "rotateFasta.sh", which generate the chrMC.* and chrMR.* files.

munishika commented 1 year ago

I did rerun "./install_prerequisites.sh". It creates the chrMR.fa file, but it is empty:

chrM

chrMR.fa (END)
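A quick way to confirm whether a reference FASTA really lacks sequence (rather than just looking empty in a pager) is to count its non-header characters. The sketch below is not part of MitoHPC: `fasta_bases` is a hypothetical helper name, and the example path assumes that `$HP_RDIR/$HP_MTR.fa` resolves to RefSeq/chrMR.fa after sourcing init.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MitoHPC): print the number of sequence
# characters in a FASTA file, ignoring header lines (those starting with ">")
# and all whitespace. Prints 0 for an empty or header-only file.
fasta_bases() {
    local fasta=$1
    if [ ! -r "$fasta" ]; then
        echo "cannot read $fasta" >&2
        return 1
    fi
    # grep -v exits non-zero when nothing is left after dropping headers,
    # so "|| true" keeps the pipeline usable under "set -e -o pipefail".
    { grep -v '^>' "$fasta" || true; } | tr -d '[:space:]' | wc -c | tr -d '[:space:]'
}

# Example (assumed path, adjust to your install):
#   fasta_bases "$HP_RDIR/$HP_MTR.fa"
```

A result of 0 means the file holds no sequence, in which case no FASTA index or sequence dictionary can be built from it and the upstream generation step is the thing to debug.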

munishika commented 1 year ago

Hi Daniela,

Thanks for your help previously.

I am still stuck with the same issue. I have followed all the steps that you mentioned, but still no luck.

It looks like there is an issue with creating chrMR.fa and its companion files, e.g. the .fai and .dict files.

Is there something I can do to fix the issue? I have tried everything, I have also re-downloaded MitoHPC, just to be sure.

I really appreciate your help.

Best, Munishika


dpuiu commented 1 year ago

Hi Munishika,

Could you try running "rotateFasta.sh" manually? This script has the -x flag set and will print out the commands and parameters.

cd scripts/
. ./init.sh
./rotateFasta.sh $HP_MT $HP_RDIR/$HP_MT $HP_E $HP_RDIR/$HP_MTR
ls -l $HP_RDIR/$HP_MTR.*

Daniela Puiu

Bioinformatics Engineer

Department of Biomedical Engineering

Johns Hopkins University



munishika commented 1 year ago

Hi Daniela,

Thanks for your reply.

I followed your instructions, but still no luck. Please find the output below:

(base) k2142499@erc-hpc-login2:/scratch/prj/mito_als/MitoHPC/scripts$ ./rotateFasta.sh $HP_MT $HP_RDIR/$HP_MT $HP_E $HP_RDIR/$HP_MTR

Tool: bedtools getfasta (aka fastaFromBed)
Version: v2.23.0
Summary: Extract DNA sequences into a fasta file based on feature coordinates.

Usage: bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf> -fo <fasta>

Options:
-fi Input FASTA file
-bed BED/GFF/VCF file of ranges to extract from -fi
-fo Output file (can be FASTA or TAB-delimited)
-name Use the name field for the FASTA header
-split given BED12 fmt., extract and concatenate the sequences from the BED "blocks" (e.g., exons)
-tab Write output in TAB delimited format.
-s Force strandedness. If the feature occupies the antisense, strand, the sequence will be reverse complemented.
-fullHeader Use full fasta header.

Any other suggestions??

dpuiu commented 1 year ago

Hi Munishika,

It looks like the problem is related to the "bedtools getfasta" call.

"Bedtools" version 2.23 is relatively old (2015). The current version is 2.30.0 (2021).

In version 2.23, "-fo output_file" is required when using "bedtools getfasta". In our case that would be "-fo /dev/stdout" (the standard output). Later versions of "bedtools getfasta" use "-fo /dev/stdout" if no "-fo output_file" is specified.

You can either edit "rotateFasta.sh" or update "bedtools". I will update "rotateFasta.sh" to fix this issue.

Thanks for the feedback and let me know how it works or if you run into any other issues.
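To see which side of this you are on before rerunning the install, the installed bedtools version can be compared against 2.30.0, the bedtools release dated 2021. This is only a sketch: `version_ge` and `parse_version` are hypothetical helper names, `sort -V` is assumed to be available (GNU coreutils), and the thread does not say in exactly which release "-fo" became optional, so 2.30.0 is used here as a reference point and not as a documented cutoff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2. Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: pull a bare version number such as "2.23.0" out of
# text like "bedtools v2.23.0".
parse_version() {
    printf '%s\n' "$1" | sed -n 's/.*v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only runs the check when bedtools is actually on PATH.
if command -v bedtools >/dev/null 2>&1; then
    installed=$(parse_version "$(bedtools --version)")
    if version_ge "$installed" 2.30.0; then
        echo "bedtools $installed: 2.30.0 or newer"
    else
        echo "bedtools $installed: older than 2.30.0; pass -fo /dev/stdout explicitly or upgrade"
    fi
fi
```

With an older bedtools, the alternative to upgrading is to add "-fo /dev/stdout" to the "bedtools getfasta" call inside "rotateFasta.sh", as described above.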


munishika commented 1 year ago

Hi Daniela,

Thanks for getting back to me. I used Bedtools version 2.30.0 (2021) and it did the trick.

Best, Munishika