webyrd / mediKanren

Proof-of-concept for reasoning over the SemMedDB knowledge base, using miniKanren + heuristics + indexing.
MIT License

building index with rust #22

Open pkpkpk opened 6 years ago

pkpkpk commented 6 years ago

So I've started porting code/csv-semmed-ordered-unique-enum.rkt to Rust. It works on sample_semmed.csv.

On the SemMedDB page I am seeing a ~2GB gzipped SQL dump for PREDICATION. Can you confirm this is the right data source?

If that is the case, the CSV file it produces is about 9.5GB. I think it would be simpler to just decompress and process the sql.gz directly, as it's really almost identical to CSV anyway. I'd like to make the tool usable for everyone, though, so if you need explicit CSV support, that would be good to know.
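
To illustrate why the SQL dump is close to CSV already, here is a rough sketch (not the actual port) of pulling rows out of mysqldump-style `INSERT INTO ... VALUES (...),(...);` lines using only the standard library. The function names `parse_insert_line`, `to_csv_row` and `sql_to_csv` are made up for this example. It assumes one INSERT statement per line, UTF-8 input, and single-quoted strings with backslash escapes; the escaped character is kept literally, which is a simplification. Decompression is left out: the reader passed in would need to wrap something like the `flate2` crate's `GzDecoder` in a `BufReader`.

```rust
use std::io::{BufRead, Write};

/// Split one `INSERT INTO ... VALUES (..),(..);` line into rows of fields.
/// Hypothetical helper: assumes mysqldump-style quoting, no nested parens.
fn parse_insert_line(line: &str) -> Vec<Vec<String>> {
    let mut rows = Vec::new();
    // Skip everything up to and including the VALUES keyword.
    let body = match line.find("VALUES") {
        Some(i) => &line[i + "VALUES".len()..],
        None => return rows,
    };
    let mut row: Vec<String> = Vec::new();
    let mut field = String::new();
    let mut in_tuple = false;
    let mut in_quote = false;
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if in_quote {
            match c {
                // Backslash escape: keep the next character as-is (simplified).
                '\\' => {
                    if let Some(next) = chars.next() {
                        field.push(next);
                    }
                }
                '\'' => in_quote = false,
                _ => field.push(c),
            }
        } else if in_tuple {
            match c {
                '\'' => in_quote = true,
                ',' => row.push(std::mem::take(&mut field)),
                ')' => {
                    // End of tuple: flush the last field and the row.
                    row.push(std::mem::take(&mut field));
                    rows.push(std::mem::take(&mut row));
                    in_tuple = false;
                }
                _ => field.push(c),
            }
        } else if c == '(' {
            in_tuple = true;
        }
        // Anything else between tuples (commas, the final ';') is ignored.
    }
    rows
}

/// Join fields into one CSV line, quoting only where needed.
fn to_csv_row(fields: &[String]) -> String {
    fields
        .iter()
        .map(|f| {
            if f.contains(',') || f.contains('"') || f.contains('\n') {
                format!("\"{}\"", f.replace('"', "\"\""))
            } else {
                f.clone()
            }
        })
        .collect::<Vec<_>>()
        .join(",")
}

/// Stream a (decompressed) SQL dump and write its rows out as CSV.
fn sql_to_csv<R: BufRead, W: Write>(input: R, mut out: W) -> std::io::Result<()> {
    for line in input.lines() {
        let line = line?;
        // Only INSERT statements carry data; skip DDL and comments.
        if !line.starts_with("INSERT INTO") {
            continue;
        }
        for row in parse_insert_line(&line) {
            writeln!(out, "{}", to_csv_row(&row))?;
        }
    }
    Ok(())
}
```

Since `sql_to_csv` only needs a `BufRead`, the same code path could take either the decompressed dump or a gzip stream, and the existing CSV handling could stay as a separate entry point if explicit CSV support turns out to be needed.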