cggh / scikit-allel

A Python package for exploring and analysing genetic variation data
MIT License

From HaplotypeArray to FASTA #357

Closed. raveancic closed this issue 3 years ago

raveancic commented 3 years ago

Dear all, thank you for this wonderful tool. For a couple of days I have been trying to figure out how to write a FASTA file from a VCF file (containing multiple haploid sequences). It seems that this tool has all the features needed to solve my problem.

However, I am finding it difficult to get from the HaplotypeArray object to the sequence of arrays (dtype="|S1") needed by the allel.write_fasta() function. I tried different things and thought that the map_alleles() function was the solution, but it seems to work only when the starting array and the mapping object have the same dtype.

Can anyone help me with this conversion? I am probably missing something. Thank you in advance, Alessandro

raveancic commented 3 years ago

I wrote a script built on scikit-allel functions, including a function that maps the alleles to the numbers. The resulting array can then be saved as a multi-FASTA file; the function is available here. I am closing this issue for obvious reasons.
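
For anyone landing on this issue, here is a minimal sketch of this kind of mapping. It is not the linked script. It assumes biallelic or multiallelic SNPs only (single-base alleles), haplotype codes where 0 means REF, k means the k-th ALT and negative values mean missing, and REF/ALT arrays laid out one row per variant. The function name `haplotypes_to_sequences` and the `N` placeholder for missing calls are hypothetical choices made for illustration.

```python
import numpy as np


def haplotypes_to_sequences(haplotypes, ref, alt, missing=b"N"):
    """Turn integer allele codes into one 'S1' array per haplotype.

    haplotypes : (n_variants, n_haplotypes) ints; 0 = REF, k = k-th ALT, < 0 = missing
    ref        : (n_variants,) single-base REF alleles
    alt        : (n_variants, n_alt) single-base ALT alleles
    """
    haplotypes = np.asarray(haplotypes)
    ref = np.asarray(ref, dtype="S1")
    alt = np.asarray(alt, dtype="S1")

    # Per-variant lookup table: column 0 is REF, columns 1..n_alt are ALT,
    # and the final column is the placeholder used for missing calls.
    fill = np.full((ref.shape[0], 1), missing, dtype="S1")
    lookup = np.concatenate([ref[:, None], alt, fill], axis=1)

    # Redirect every negative (missing) code to the placeholder column.
    codes = np.where(haplotypes < 0, lookup.shape[1] - 1, haplotypes)

    # For each variant (row), pick the base that each haplotype's code points to.
    bases = np.take_along_axis(lookup, codes, axis=1)

    # One 1-D 'S1' array per haplotype (column).
    return [bases[:, j] for j in range(bases.shape[1])]


# Toy data: 3 variants, 2 haplotypes, up to 2 ALT alleles per variant.
ref = ["A", "C", "G"]
alt = [["T", ""], ["G", "T"], ["A", ""]]
haplotypes = [[0, 1], [2, 0], [-1, 1]]

sequences = haplotypes_to_sequences(haplotypes, ref, alt)
print([s.tobytes() for s in sequences])  # [b'ATN', b'TCA']
```

The returned list holds one dtype "S1" array per haplotype, which is the shape of input that allel.write_fasta() is described as needing above; it would be passed along with a matching list of sequence names. Note that the output contains only the variant sites, not the invariant positions between them, and that indels would be truncated to their first base by the "S1" cast.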