DEIB-GECO / GMQL

GMQL - GenoMetric Query Language
http://www.bioinformatics.deib.polimi.it/geco/
Apache License 2.0

update GDM Implementation #68

Closed akaitoua closed 6 years ago

akaitoua commented 7 years ago

GDM is based on four fixed columns: chr, start, stop, and strand. The implementation always maintains these columns. Some data has no strand, or the strand is the same for every region in the sample, so storing it wastes memory. Other datasets have only a start and no stop (single-base data), so storing the stop is also a large waste of memory.

The GDM update should handle the above cases and add a dictionary encoding for the chromosome column.

The aim of the update is to reduce memory consumption and thus improve overall system performance.
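
To make the proposal concrete, here is a minimal sketch of the idea in Python. It is not the GMQL implementation: the names `CompactSample`, `compact`, and `expand` are hypothetical, and it assumes that single-base data means `stop == start + 1` (the actual convention may differ). The chromosome column is dictionary-encoded, and the stop and strand columns are dropped when they can be derived.

```python
from typing import List, NamedTuple, Optional, Tuple

Region = Tuple[str, int, int, str]  # (chr, start, stop, strand)


class CompactSample(NamedTuple):
    """Hypothetical columnar layout; not the actual GDM classes."""
    chrom_names: List[str]         # dictionary: code -> chromosome name
    chrom_codes: List[int]         # one small integer per region
    starts: List[int]
    stops: Optional[List[int]]     # None when every stop == start + 1 (assumed)
    strands: Optional[List[str]]   # None when all regions share one strand
    uniform_strand: Optional[str]  # the shared strand, when strands is None


def compact(regions: List[Region]) -> CompactSample:
    codes = {}
    chrom_names: List[str] = []
    chrom_codes: List[int] = []
    for chrom, _, _, _ in regions:
        # Assign the next free code the first time a chromosome is seen.
        if chrom not in codes:
            codes[chrom] = len(chrom_names)
            chrom_names.append(chrom)
        chrom_codes.append(codes[chrom])

    starts = [r[1] for r in regions]
    stops = [r[2] for r in regions]
    strands = [r[3] for r in regions]

    # Drop the stop column if it can be derived from start.
    single_base = all(e == s + 1 for s, e in zip(starts, stops))
    # Drop the strand column if every region has the same strand.
    uniform = len(set(strands)) == 1

    return CompactSample(
        chrom_names,
        chrom_codes,
        starts,
        None if single_base else stops,
        None if uniform else strands,
        strands[0] if uniform else None,
    )


def expand(sample: CompactSample) -> List[Region]:
    """Rebuild the full (chr, start, stop, strand) view."""
    n = len(sample.starts)
    stops = sample.stops if sample.stops is not None else [s + 1 for s in sample.starts]
    strands = sample.strands if sample.strands is not None else [sample.uniform_strand] * n
    chroms = [sample.chrom_names[c] for c in sample.chrom_codes]
    return list(zip(chroms, sample.starts, stops, strands))
```

For a sample where every region is one base wide and unstranded, only the chromosome codes and the starts are stored per region, and `expand` still returns the full four-column view.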