bio4j / dynamograph

GSoC 2014 project - a DynamoDB based graph DB
GNU Affero General Public License v3.0

Implement Parser for ncbiTaxonomy Dataset #23

Open alberskib opened 10 years ago

alberskib commented 10 years ago

Implement a parser for the ncbiTaxonomy dataset and integrate it with the code that stores data in DynamoDB.

alberskib commented 10 years ago

File format description: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_readme.txt
Data files: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz

According to https://github.com/bio4j/scala-model/blob/master/src/main/scala/bio4j/model/module/ncbiTaxonomy.scala, ncbiTaxon has a subrank, but I do not see any corresponding data in the linked files. I had the same problem with the scientific name, but then found it in names.dmp.
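
For illustration, here is a minimal sketch of reading the two relevant files from the dump, assuming the layout described in taxdump_readme.txt: fields separated by a tab, a pipe and a tab, rows ending with a tab and a pipe, nodes.dmp starting with tax_id, parent tax_id and rank, and names.dmp holding tax_id, name_txt, unique name and name class. The names `Taxon`, `split_dmp_line`, `scientific_names` and `parse_taxa` are hypothetical and are not part of dynamograph or the bio4j model. Python is used only for brevity, and the DynamoDB write step is left out.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Taxon(NamedTuple):
    """Hypothetical record; not the bio4j ncbiTaxon type."""
    tax_id: str
    parent_id: str
    rank: str
    scientific_name: str


def split_dmp_line(line: str) -> List[str]:
    # Rows end with TAB + "|" and fields are separated by TAB + "|" + TAB.
    line = line.rstrip("\r\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return [field.strip() for field in line.split("\t|\t")]


def scientific_names(names_lines: Iterable[str]) -> Dict[str, str]:
    # names.dmp columns: tax_id, name_txt, unique name, name class.
    # Only rows whose name class is "scientific name" are kept.
    names: Dict[str, str] = {}
    for line in names_lines:
        fields = split_dmp_line(line)
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[fields[0]] = fields[1]
    return names


def parse_taxa(nodes_lines: Iterable[str], names: Dict[str, str]) -> Iterator[Taxon]:
    # nodes.dmp columns start with: tax_id, parent tax_id, rank.
    for line in nodes_lines:
        fields = split_dmp_line(line)
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        yield Taxon(fields[0], fields[1], fields[2], names.get(fields[0], ""))
```

With both files opened as text, `parse_taxa(nodes_file, scientific_names(names_file))` yields one record per node. In this sketch a taxon without a scientific name row gets an empty string rather than raising an error.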

alberskib commented 10 years ago

@bio4j/dynamograph Bump

laughedelic commented 10 years ago

what's the question? maybe this is useful: https://github.com/bio4j/titandb/blob/v0.3.1/src/main/java/com/ohnosequences/bio4j/titan/programs/ImportNCBITaxonomyTitan.java