CFSAN-Biostatistics / snp-pipeline

SNP Pipeline is a pipeline for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced from samples of interest to food safety.

BACTERIA ORGANISM WITH TWO CHROMOSOMES #20

Closed by vappiah 3 years ago

vappiah commented 3 years ago

Hello CFSAN developers,

I would like to use the CFSAN SNP Pipeline to identify SNPs in _Vibrio cholerae_ isolates. However, each isolate has two chromosomes, meaning two reference sequences. Is it possible to run the pipeline by specifying two reference genomes? Please advise, and thanks in advance.

stevendavis commented 3 years ago

Your use-case is not something we have attempted. The CFSAN snp pipeline can only accept a single reference file. The reference can have multiple contigs. You might consider combining the two chromosomes into a single fasta file. YMMV.
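
As a rough illustration of the "combine into a single fasta file" suggestion, here is a minimal Python sketch. It is not part of the snp-pipeline; the function name `combine_fasta` and any file names are hypothetical. It copies every record from the input files into one multi-record FASTA and refuses duplicate sequence IDs, on the assumption that each contig in the reference should have a unique name.

```python
from pathlib import Path


def combine_fasta(input_paths, output_path):
    """Copy all records from input_paths into one multi-record FASTA file.

    Returns the sequence IDs in the order they were written.
    Hypothetical helper for illustration; not part of snp-pipeline.
    """
    seen_ids = []
    with open(output_path, "w") as out:
        for path in input_paths:
            for line in Path(path).read_text().splitlines():
                line = line.rstrip()
                if not line:
                    continue  # drop blank lines between or inside records
                if line.startswith(">"):
                    # The ID is the first whitespace-delimited token of the header.
                    parts = line[1:].split()
                    seq_id = parts[0] if parts else ""
                    if seq_id in seen_ids:
                        raise ValueError(f"duplicate sequence ID: {seq_id!r}")
                    seen_ids.append(seq_id)
                # Always terminate lines, so a file that lacks a trailing
                # newline cannot run into the next record's header.
                out.write(line + "\n")
    return seen_ids
```

For example, `combine_fasta(["chr1.fasta", "chr2.fasta"], "reference.fasta")` would write both chromosomes to `reference.fasta` (placeholder file names), which could then be supplied as the single reference file.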

vappiah commented 3 years ago

Thanks @stevendavis

Another strategy I want to try is to call SNPs against chromosome 1 and chromosome 2 separately, with each chromosome in its own reference file.

hughrandFDA commented 3 years ago

Concatenating the two chromosomes should work reasonably well. I don’t have experience with Vibrio, but my understanding is that its genome is fairly dynamic, and that may cause you more trouble than having to deal with two chromosomes. Also keep in mind that reference-based analyses are best suited to isolates that are fairly close evolutionarily.

Hugh

vappiah commented 3 years ago

Thanks everyone for your comments.