genomika / biopandas

Biopandas provides tools for the analysis and comprehension of high-throughput genomic data.

Pandas FASTA/FASTQ reader #1

Open luizirber opened 10 years ago

luizirber commented 10 years ago

This is a problem I stumbled on a few days ago. A specialized reader for FASTA/FASTQ files might be useful, instead of parsing them some other way and then loading the result into a DataFrame.

One possible problem is the lack of formal specifications for these file formats, but a good start would be to read just the sequence name and content from FASTA (plus the quality string from FASTQ).
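As a rough illustration of that starting point, here is a minimal sketch using only the standard library. The function names `read_fasta` and `read_fastq` are hypothetical and not part of biopandas or pandas. The FASTQ reader assumes the common layout of exactly four lines per record; multi-line FASTQ records are not handled.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA stream (hypothetical helper)."""
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) triples (hypothetical helper).

    Assumes exactly four lines per record: @name, sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # ignore blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' at start of record, got {header!r}")
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, content ignored
        quality = handle.readline().rstrip("\r\n")
        yield header[1:], sequence, quality


# Small in-memory example; a real file opened with open(path) works the same way.
fasta_records = list(read_fasta(io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")))
fastq_records = list(read_fastq(io.StringIO("@r1\nACGT\n+\nIIII\n")))
```

The resulting lists of tuples can then be passed to pandas, for example `pd.DataFrame(fasta_records, columns=["name", "sequence"])`.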

marcelcaraciolo commented 10 years ago

Thanks for your feedback, @luizirber! I have already used pandas to analyze some of our genomic data, especially BED files, and the results were quite promising. I couldn't find any Python package focused on genomic data analysis. Since I am a huge pandas fan and we routinely use it to analyze our data, we decided to start a pandas spin-off called biopandas, focused on biological data. I have some scripts here and will start committing them this weekend. Your request is already on our list. Please open more issues so we can discuss them here! The project will be open source, and any suggestions or collaboration are welcome!

ChillarAnand commented 8 years ago

Any updates?