Open luizirber opened 10 years ago
Thanks @luizirber for your feedback! I have already used pandas for some analysis of our genomic data, especially BED files, and the results were quite promising! I couldn't find any Python package focused on genomic data analysis. Since I am a huge pandas fan and we routinely use it to analyse our data, we decided to start a spin-off of pandas called biopandas, focused on biological data. I have some scripts here and will start committing them this weekend. Your requirement is already on our list. Please open more issues so we can discuss them here! The project will be open source, and any suggestions or collaboration are welcome!
Any updates?
This is a problem I stumbled on a few days ago. A specialized reader for FASTA/FASTQ files would be useful, instead of parsing them some other way and then loading the result into a DataFrame.
One possible problem is the lack of a formal specification for these file formats, but a good start would be just reading the sequence name and content from FASTA (plus the quality from FASTQ).
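To make the suggestion concrete, here is a rough sketch of what such a reader could look like. It is not biopandas code: the function names `read_fasta` and `read_fastq` and the record keys are made up for illustration, and the FASTQ part assumes the common layout of exactly four lines per record (no wrapped sequences), which real files do not always follow.

```python
import io


def read_fasta(handle):
    """Yield {'name', 'sequence'} dicts from a FASTA text stream (hypothetical helper)."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield {"name": name, "sequence": "".join(chunks)}
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if name is not None:
        yield {"name": name, "sequence": "".join(chunks)}


def read_fastq(handle):
    """Yield {'name', 'sequence', 'quality'} dicts (hypothetical helper).

    Assumes exactly four lines per record: @name, sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().strip()
        yield {"name": header[1:], "sequence": sequence, "quality": quality}


# Small in-memory examples instead of real files.
fasta_text = ">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n"
fastq_text = "@read1\nACGT\n+\nIIII\n"

fasta_records = list(read_fasta(io.StringIO(fasta_text)))
fastq_records = list(read_fastq(io.StringIO(fastq_text)))
```

Each result is a plain list of dicts, so it can be handed straight to `pandas.DataFrame(fasta_records)` to get one row per sequence with `name` and `sequence` columns (plus `quality` for FASTQ).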