Closed ps120195 closed 4 years ago
Hi,
Please find the attachments. I ran the SARS-CoV-2 sequencing pipeline on nanopore data and am getting two kinds of results. The commands and the samples were the same, but the runs were on different systems.
Can you tell me why this is happening?
On Tue, Mar 31, 2020 at 2:05 AM Duncan MacCannell notifications@github.com wrote:
Was there a specific issue, or is this more of a philosophical conjecture?
Happy to help. Which pipeline? Attachments were missing.
I ran it three times, but I still do not get the VCF details shown in image 1. Instead I get a message saying the fasta sequence does not match the REF allele ... and so on.
If these are two different systems, you're sure that the perl environment and all dependencies are the same version?
Could that cause a difference like this?
All dependencies and the perl environment are the same, for sure.
Also, the dependencies were installed with pip, so their versions are the same on both systems. Please suggest why this is happening. What output should we expect from vcf_mask_lowcoverage.pl in the terminal?
I'm not clear on the difference between screen shots 1 and 2.
In screenshot 1, it looks like it finished correctly. Did you get a reasonable consensus in 'consensus.fasta'?
In screenshot 2, something went wrong. Are you using the same reference fasta that was used for read mapping? Bcftools is very picky about the vcf and the reference to which it applies variants. It may be possible that the reference was getting masked incorrectly, but I can't work out why that would be. I wonder if you could check the samtools depth at position 8782 and potentially let me have a look at your vcf? Interestingly, position 8782 is one where we have observed a lot of variation.
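For anyone who wants to check the "fasta sequence does not match the REF allele" condition by hand, here is a minimal Python sketch that compares each VCF record's REF column with the bases at the same position in the reference. It is only an illustration under stated assumptions: the function names `read_fasta` and `ref_mismatches` and the toy data are hypothetical, the VCF is assumed to be uncompressed plain text, and this is not how bcftools itself performs the check.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into {sequence_name: sequence}.

    Trailing "\r" and "\n" are both stripped, so Windows line endings
    do not leak into the sequence.
    """
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def ref_mismatches(vcf_handle, seqs):
    """Return (chrom, pos, vcf_ref, fasta_ref) for records whose REF differs."""
    bad = []
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        chrom, pos, _id, ref = line.split("\t")[:4]
        start = int(pos) - 1  # VCF positions are 1-based
        found = seqs.get(chrom, "")[start:start + len(ref)]
        if found.upper() != ref.upper():
            bad.append((chrom, int(pos), ref, found))
    return bad


# Toy data (made up for illustration, not taken from this thread).
toy_fasta = io.StringIO(">toy example\nACGT\nACGT\n")
toy_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "toy\t3\t.\tG\tT\t.\t.\t.\n"
    "toy\t5\t.\tC\tT\t.\t.\t.\n"
)
mismatches = ref_mismatches(toy_vcf, read_fasta(toy_fasta))
print(mismatches)
```

On real data you would open the reference and the VCF with `open(...)` instead of `io.StringIO`. An empty list means every REF allele agrees with the reference, which would point the search towards something other than the reference content, such as line endings.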
Yes, I am using the same reference that I used for mapping.
consensus2.fasta is my consensus fasta and MN908947.3.fasta is my reference file, which I used for mapping too. I am also getting that C to T variant at the same location, 8782.
The samtools depth at position 8782 is 1871.
Here I ran everything from start to finish and the result is still the same. Please see the screenshot.
Hmm. I'd like to get to the bottom of this, but I need a little more info. Can you show me the output of the following:
bcftools view VIC07_ONT.vcf | grep -EC3 "\s8782\s"
bcftools view VIC07_ONT.vcf.masked.vcf.gz | grep -EC3 "\s8782\s"
It was -EC3, sorry.
Those look OK to me. The only other thing I can think of is that there is something funky going on with the reference. Can you try running dos2unix MN908947.3.fasta and then running the script again? If that is the issue, I can make a change to fix this (I will add it in any case).
Yes, sure.
I tried the dos2unix command and ran the full script again. Still no change in the output.
I have now tried using MN908947.fna instead of MN908947.fasta, and it worked. See the output.
Ok, so it looks like you converted the line endings for "MN908947.fasta" and it worked. Using "MN908947.fna" (which is identical except that its line endings were not converted to unix line endings) throws the error. I think these are all consistent, unless I misunderstand you. I will make the change to take into account fasta files with Windows line endings.
Thank you for helping me out. I learnt a lot during this error hunt. Now that the error is resolved, I want to know whether I have to use only file.fna for this pipeline?
No worries! Glad you caught this, as it's an easy fix but annoying for users. The filename doesn't matter, as long as the fasta header is the same and (for now) the Windows line endings of your reference file are converted to unix line endings.
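A quick way to check a reference for Windows line endings without dos2unix is sketched below in Python. The helper names (`has_crlf`, `to_unix`, `naive_sequence`) and the toy sequence are hypothetical. The "naive" parser is an assumption about how a stray carriage return can end up inside a sequence; it is not a description of what vcf_mask_lowcoverage.pl actually does.

```python
def has_crlf(data: bytes) -> bool:
    """True if the file content contains Windows (CRLF) line endings."""
    return b"\r\n" in data


def to_unix(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, similar in spirit to dos2unix."""
    return data.replace(b"\r\n", b"\n")


def naive_sequence(data: bytes) -> bytes:
    """Join sequence lines after splitting on "\n" only.

    With CRLF input, each line keeps a trailing "\r", so the joined
    sequence contains extra bytes and no longer lines up with the
    coordinates used in the VCF.
    """
    lines = data.split(b"\n")
    return b"".join(l for l in lines if l and not l.startswith(b">"))


# Toy reference with Windows line endings (the sequence is made up).
windows = b">MN908947.3\r\nACGT\r\nTTGA\r\n"

print(has_crlf(windows))                  # CRLF present in the raw bytes
print(naive_sequence(windows))            # carriage returns leak into the sequence
print(naive_sequence(to_unix(windows)))   # clean sequence after conversion
```

For a real file, read it with `open(path, "rb").read()`, and write the result of `to_unix` back out in binary mode if `has_crlf` reports a problem.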
@dmaccannell I think this can be closed