Closed deleod closed 3 years ago
If you have a copy of your SNP data file in a VCF format, that can easily be accomplished with VCFtools or BCFtools. I'm not aware of anything that filters the immanc format used by BayesAss.
Ok, thanks. I do have a vcf file so I will look into VCFtools to filter and try re-running.
I was hoping to find a way to edit the structure file directly, so that I can then convert to immanc using the pyradStr2immanc.pl script.
Yeah, sorry, I'm not aware of anything that filters Structure files directly.
Appreciate the tip! Will attempt to do this with the VCF file if needed. Not sure how straightforward it is to convert VCF to immanc for BayesAss, though?
May have found a workaround for filtering the structure file directly in R using the missingno function from the poppr package. Will report back.
On Wed, May 19, 2021, 3:41 PM Steve Mussmann @.***> wrote:
Yeah, sorry, I'm not aware of anything that filters Structure files directly.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stevemussmann/BayesAss3-SNPs/issues/8#issuecomment-844413184, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIT7WWARS6DSAMFA6NTRNSTTOQH6JANCNFSM45FHFMLQ .
Usually I end up going through an intermediate format or two. There's a vcf2phylip converter (https://github.com/edgardomortiz/vcf2phylip), then I convert to structure (https://github.com/stevemussmann/file_converters/blob/master/phy2str.pl) and finally use the pyradStr2immanc.pl converter. It's far from the most efficient thing, but I'm not aware of any converters that would go direct from VCF to immanc. Maybe PGDSpider, but I've had mixed experiences with that program.
Thanks for the tip! I will check this out.
I was able to filter the structure file in R: I imported it as a genind object (adegenet package), filtered it with poppr's missingno, and exported it back to a structure file using the R function genind2structure written by Clark 2015 (https://github.com/lvclark/R_genetics_conv/blob/master/genind2structure.R), though it did take a while to write the file. I then converted with your pyradStr2immanc.pl script instead of PGDSpider.
However, I wasn't able to get it to run successfully with the filtered SNP file. I am now getting a new Segmentation Fault error, any ideas?
'Made new Indiv object Going to read input file Setting alleles pop1 0 Segmentation fault: 11'
In my experience, segmentation faults usually result from specifying an incorrect number of loci. If you converted to immanc with my Perl script, you should be able to check the number of loci using `awk '{print $3}' filename.txt | sort | uniq | wc -l`, replacing "filename.txt" with the name of your file.
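A minimal sketch of that check, capturing the count in a variable so it can be compared against the value given to `-l`. It assumes the immanc file written by pyradStr2immanc.pl is whitespace-delimited with the locus name in the third column, as the awk command implies; the file name and toy rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical toy immanc file (assumed columns: individual, population, locus, allele1, allele2).
cat > example_immanc.txt <<'EOF'
ind1 pop1 loc1 1 2
ind1 pop1 loc2 1 1
ind2 pop1 loc1 2 2
ind2 pop1 loc2 0 0
EOF

# Count distinct locus names in column 3.
# tr removes the leading spaces that some wc implementations print.
nloci=$(awk '{print $3}' example_immanc.txt | sort -u | wc -l | tr -d ' ')
echo "loci: $nloci"
```

The number reported here should match the value passed to BA3-SNPS with `-l`.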
Ahh, not sure how I missed this. Thank you!
Hi, do you have any suggestions (or scripts) for removing loci where data are missing for all individuals? I am having an issue running the program with my input file and believe this is the culprit. `>BA3-SNPS -v -i100000000 -b1000000 -t -g -u -l 21292 -F wgenome_20_BA3.txt
Please cite: Wilson & Rannala (2003). Bayesian Inference of recent
migration rates using multilocus genotypes. Genetics 163:1177-1191.
Please also cite: Mussmann, Douglas, Chafin, & Douglas (2019). BA3- SNPs: Contemporary migration reconfigured in BayesAss for next-
generation sequence data. Methods in Ecology and Evolution.
Made new Indiv object Going to read input file Setting alleles pop_1 0 Read input file
At least one locus may contain no data for all samples in your input file.
gsl: ../gsl/gsl_rng.h:200: ERROR: invalid n, either 0 or exceeds maximum value of generator
Default GSL error handler invoked.
Abort trap: 6 `
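One possible way to drop such loci, offered as a rough sketch rather than a tested BA3-SNPs workflow: a two-pass awk filter that keeps only loci with at least one called genotype. It assumes a five-column immanc layout (individual, population, locus, allele 1, allele 2) and that missing alleles are coded as 0; both assumptions should be checked against your own file. The file names and toy rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical toy input; loc2 has no data for any individual.
cat > toy_immanc.txt <<'EOF'
ind1 pop_1 loc1 1 2
ind1 pop_1 loc2 0 0
ind1 pop_1 loc3 0 0
ind2 pop_1 loc1 0 0
ind2 pop_1 loc2 0 0
ind2 pop_1 loc3 2 2
EOF

# Pass 1 (NR==FNR): remember every locus with at least one non-missing allele.
# Pass 2: print only the rows that belong to a remembered locus.
# ASSUMPTION: 0 is the missing-data code in columns 4 and 5.
awk 'NR==FNR { if ($4 != 0 || $5 != 0) keep[$3] = 1; next }
     ($3 in keep)' toy_immanc.txt toy_immanc.txt > toy_filtered.txt

# Recount loci so the -l value matches the filtered file.
kept=$(awk '{print $3}' toy_filtered.txt | sort -u | wc -l | tr -d ' ')
echo "loci kept: $kept"
```

After filtering, the `-l` value would need to be updated to the new locus count, since a mismatch there is the usual cause of the segmentation fault mentioned earlier in the thread.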