dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Could output files reflect polyploidy? #434

Open ThrsBT opened 3 years ago

ThrsBT commented 3 years ago

Hello,

I'm currently analysing paired-end GBS data from plum trees using ipyrad v.0.9.1-dev. The pipeline works well, but there is one problem with the output files (step 7) that I couldn't solve.

As all of my organisms are hexaploid (allopolyploid), I set the "max_alleles_consens" parameter to 6, as recommended in the ipyrad documentation (parameters) and in another issue post. I hope that's correct so far. Are there any other parameters in the params file that should be changed because my samples are hexaploid?

My final goal is to use the ipyrad output (XXX.str) in STRUCTURE, so I need an XXX.str file containing 6 alleles per sample. However, the documentation says: "If max_alleles_consens is set > 2 then more alleles are allowed, however, heterozygous base calls are still made under the assumption of diploidy i.e., hetero allele frequency=50%." Accordingly, the XXX.str file I get from ipyrad includes only two alleles per sample.

Now I am wondering: Is it even possible to create a structure-file with 6 alleles per sample using ipyrad? And if not, do you know any procedure to create such a file from the ipyrad output?

Thank you in advance for any help and hints!

Best regards
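
For readers with the same question, here is a minimal sketch of the target layout rather than of anything ipyrad produces. STRUCTURE can read files in which each individual occupies PLOIDY consecutive rows, one row per allele copy, with an integer reserved for missing data (-9 is a common choice). The function names, the sample names, the integer allele codes, and the choice to pad unknown copies with the missing code are all assumptions made for illustration. The hexaploid genotype calls themselves would have to come from a polyploid-aware genotype caller, not from the ipyrad .str file.

```python
MISSING = -9  # assumed missing-data code; must match MISSING in STRUCTURE's mainparams


def structure_rows(sample, loci, ploidy=6, missing=MISSING):
    """Return `ploidy` tab-separated rows for one sample, one row per allele copy.

    `loci` is a list with one entry per locus; each entry is a list of
    integer allele codes (hypothetical encoding) with at most `ploidy` items.
    Copies that are not called are padded with the missing code.
    """
    rows = []
    for copy in range(ploidy):
        calls = []
        for alleles in loci:
            if len(alleles) > ploidy:
                raise ValueError(f"{sample}: more than {ploidy} alleles at a locus")
            calls.append(alleles[copy] if copy < len(alleles) else missing)
        rows.append("\t".join([sample] + [str(c) for c in calls]))
    return rows


def to_structure(genotypes, ploidy=6):
    """Build the text of a multi-row STRUCTURE file from {sample: loci}."""
    lines = []
    for sample, loci in genotypes.items():
        lines.extend(structure_rows(sample, loci, ploidy))
    return "\n".join(lines) + "\n"


# Hypothetical hexaploid calls for two samples at two loci.
genotypes = {
    "plum_A": [[0, 0, 0, 1, 1, 2], [3, 3, 3, 3, 3, 3]],
    "plum_B": [[0, 0, 1, 1, 1, 1], []],  # second locus entirely missing
}
str_text = to_structure(genotypes)
```

Writing `str_text` to a file and setting PLOIDY to 6 in STRUCTURE's parameters would then give six rows per individual. Whether padding partially called genotypes with missing values is appropriate depends on the data and should be checked against the STRUCTURE documentation.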

isaacovercast commented 3 years ago

Hello, Unfortunately ipyrad support for polyploid data is... ad hoc and poor. The effect of the diploid basecalling is that at any given site, a maximum of two alleles will be retained. If there are more than 2 alleles at a site, this information is lost. We have talked about implementing better support for polyploid data, but it's REALLY tricky. I don't have much experience with this kind of data, so I don't have many useful suggestions in terms of creating a structure file of the format you are after. Sorry I couldn't be of more help. -isaac
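
To make the information loss concrete, here is a toy sketch of what "a maximum of two alleles will be retained" means at a single site. It is not ipyrad's base-calling code: ipyrad uses a statistical model, while this sketch assumes, purely for illustration, that the two most frequent bases among the reads are kept and written as a single IUPAC ambiguity code. The function name `diploid_call` is hypothetical.

```python
from collections import Counter

# Standard IUPAC codes for two-base ambiguities.
IUPAC_HET = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def diploid_call(read_bases):
    """Toy diploid call at one site.

    Keeps the two most frequent bases (an assumption, not ipyrad's model)
    and returns (consensus_code, dropped_bases).
    """
    counts = Counter(read_bases)
    if not counts:
        return "N", []  # no reads: nothing can be called
    top = [base for base, _ in counts.most_common(2)]
    dropped = sorted(set(counts) - set(top))
    call = top[0] if len(top) == 1 else IUPAC_HET[frozenset(top)]
    return call, dropped


# A site where a hexaploid sample carries three alleles: A, G and T.
call, dropped = diploid_call("AAAAGGGTT")
```

In this example the site is reported as an A/G heterozygote and the third allele (T) disappears from the consensus, so no downstream file derived from it can recover more than two alleles per sample at that site.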

ThrsBT commented 3 years ago

Thanks for your answer!


From: Isaac Overcast notifications@github.com Sent: Monday, 1 February 2021 18:52:11 To: dereneaton/ipyrad Cc: Puchner, Theresa Anna-Maria; Author Subject: Re: [dereneaton/ipyrad] Could output files reflect polyploidy? (#434)
