The in memory representation of the data following parse by BioJS are either in a dense matrix, or in a dict of keys style sparse representation. As the authors note, specialized methods will need to be created to handle large data efficiently, however the authors may wish to consider placing emphasis instead on specialized data structures such as compressed sparse row or column.
McDonald D and Bolyen E. Referee Report For: biojs-io-biom, a BioJS component for handling data in Biological Observation Matrix (BIOM) format [version 1; referees: 1 approved with reservations]. F1000Research 2016, 5:2348 (doi: 10.5256/f1000research.10362.r16546)
This is a very good point. Right now we only use the original sparse or dense representation as defined for BIOM version 1.0 JSON. Depending on the input data, however, a lot of memory could be saved by using specialized data structures to store the BIOM object internally on parse. It can then be transformed back to the JSON representation when write is called.
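To make the idea concrete, here is a minimal sketch of how the BIOM 1.0 sparse data (a list of [row, column, value] triples) could be packed into a compressed sparse row (CSR) structure on parse and expanded again on write. The function names sparseToCsr, csrToSparse and getRow and the field names values, colIndices and rowPointers are hypothetical; none of them are part of biojs-io-biom. The sketch assumes numeric values and zero-based indices, and whether it actually saves memory depends on the data and the JavaScript engine.

```javascript
// Hypothetical helpers for illustration only, not the biojs-io-biom API.

// Pack BIOM 1.0 sparse triples [[row, col, value], ...] into CSR arrays.
// shape is [numberOfRows, numberOfColumns], as in the BIOM "shape" field.
function sparseToCsr(triples, shape) {
  const nRows = shape[0];
  // Sort a copy by row, then column, so each row's entries are contiguous.
  const sorted = triples.slice().sort((a, b) => a[0] - b[0] || a[1] - b[1]);
  const values = new Float64Array(sorted.length);
  const colIndices = new Int32Array(sorted.length);
  const rowPointers = new Int32Array(nRows + 1);
  sorted.forEach(([row, col, value], i) => {
    values[i] = value;
    colIndices[i] = col;
    rowPointers[row + 1] += 1; // count the entries in each row
  });
  // A prefix sum turns the per-row counts into start offsets.
  for (let r = 0; r < nRows; r++) {
    rowPointers[r + 1] += rowPointers[r];
  }
  return { values, colIndices, rowPointers, shape };
}

// Expand CSR arrays back into BIOM 1.0 sparse triples (for write).
function csrToSparse(csr) {
  const triples = [];
  for (let r = 0; r < csr.shape[0]; r++) {
    for (let i = csr.rowPointers[r]; i < csr.rowPointers[r + 1]; i++) {
      triples.push([r, csr.colIndices[i], csr.values[i]]);
    }
  }
  return triples;
}

// Read one row as a dense array without touching the other rows.
function getRow(csr, r) {
  const row = new Array(csr.shape[1]).fill(0);
  for (let i = csr.rowPointers[r]; i < csr.rowPointers[r + 1]; i++) {
    row[csr.colIndices[i]] = csr.values[i];
  }
  return row;
}

// Usage: a 2 x 3 matrix with three non-zero entries.
const exampleCsr = sparseToCsr([[0, 2, 5], [1, 0, 3], [0, 0, 1]], [2, 3]);
console.log(getRow(exampleCsr, 0));
console.log(csrToSparse(exampleCsr));
```

Row lookups only scan the entries between two row pointers, which is why CSR suits per-observation access; a compressed sparse column layout would be the analogous choice for per-sample access.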