gphocs-dev / G-PhoCS

G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences.

Generating the Sequence File from Multiple Consensus Genomes #51

Open FatihSarigol opened 5 years ago

FatihSarigol commented 5 years ago

Hi! This isn't really an issue with the program, but rather than asking by email, I thought posting here might help someone else, too. I have whole genomes from a few individuals, each produced by applying that individual's variants to the reference (which consists of over 1000 contigs), so they all share the same coordinates. I want to run G-PhoCS on them. Could you suggest an easy way to generate a properly formatted input sequence file from them? I don't have much time to write code for this right now, but if nobody has a similar script, I'd be happy to share mine here once I write it. Any help is much appreciated! Thanks

gphocs-dev commented 5 years ago

I heard that GLACtools has an option for generating G-PhoCS sequence input format. Can you please check and post here?

FatihSarigol commented 5 years ago

Thank you for your message! GLACtools has a script to export ACF files (which contain allele counts for either a single individual or a group of individuals, i.e. a population) in G-PhoCS format. I haven't tried it yet, but it can also convert a single-sample VCF to ACF, so that could be a way, although seemingly a long one. Any alternative ideas for starting from fasta files with the variants already applied? Thanks!

gphocs-dev commented 5 years ago

You'll probably have to write up a custom script for that. I typically end up using different custom scripts for different data sets.

grenaud commented 5 years ago

@FatihSarigol I am the author of glactools. I just saw this. Converting VCF to ACF should be straightforward. The only problem is getting an outgroup or ancestral information, if need be. There is a Perl script to convert contiguous chunks to G-PhoCS format. Let me know if you run into any issues.

FatihSarigol commented 5 years ago

Hello @grenaud Thank you for your comment! I recently wrote my own script to merge and convert fasta files of different samples into the format that G-PhoCS requires.

If anyone else needs to take that road, too, I'd be happy to share my code. I'll eventually put it on my GitHub, though I could still improve it for other users first.
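The script itself was not posted in this thread. For readers in the same situation, here is a minimal sketch of the general idea, not the author's code. It assumes the sequence file layout as I understand it from the G-PhoCS manual: a first line with the number of loci, then for each locus a header line (`name number_of_samples length`) followed by one `sample sequence` line per sample. The function names `read_fasta` and `write_gphocs`, the fixed-size windows, and the `contig_start` locus names are all hypothetical choices for illustration. A real analysis would normally select putatively neutral loci and mask low-quality sites rather than tiling whole contigs.

```python
def read_fasta(handle):
    """Parse FASTA text from an open handle into {contig_name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def write_gphocs(samples, out, locus_len=1000):
    """Write non-overlapping windows as loci (assumed G-PhoCS layout).

    samples: {sample_name: {contig_name: sequence}}, all in the same
    coordinates, i.e. every sample has the same contigs at the same lengths.
    Returns the number of loci written.
    """
    names = sorted(samples)
    loci = []
    for contig, seq in samples[names[0]].items():
        # Refuse to continue if the genomes are not in shared coordinates.
        for n in names:
            if len(samples[n].get(contig, "")) != len(seq):
                raise ValueError(f"{contig}: length differs in sample {n}")
        # Tile the contig; a trailing partial window is dropped.
        for start in range(0, len(seq) - locus_len + 1, locus_len):
            loci.append((contig, start))

    out.write(f"{len(loci)}\n")
    for contig, start in loci:
        out.write(f"\n{contig}_{start} {len(names)} {locus_len}\n")
        for n in names:
            out.write(f"{n}\t{samples[n][contig][start:start + locus_len]}\n")
    return len(loci)
```

To use it, open each sample's fasta file, pass the handle to `read_fasta`, collect the results in a dictionary keyed by sample name, and pass that dictionary and an open output file to `write_gphocs`.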

grenaud commented 5 years ago

Yes, glactools is not really designed for fasta files. It mostly targets genotype calls or single bases from BAM files.

gphocs-dev commented 5 years ago

Thanks to both of you for your input. --Ilan
