TeselaGen / fsml.org

A BioMADE Collaboration Project
https://fsml.org

Find or generate a large dataset to test CLI tool exporting #88

Closed by tgadam 1 year ago

tgadam commented 2 years ago

Find or generate a large dataset to test CLI tool exporting. See https://github.com/TeselaGen/fsml.org/issues/74#issuecomment-1203097025

eabeliuk commented 2 years ago

We might use the Phycus randomized (and extended) dataset for this.
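For the "generate" option, a minimal sketch of a synthetic dataset generator is shown here. It assumes the CLI accepts a plain CSV file; the function name `generate_dataset` and the column names (`sample_id`, `feature_N`, `target`) are hypothetical and are not taken from FSML or from the Phycus dataset.

```python
import csv
import random


def generate_dataset(path, n_rows=17_000, n_features=10, seed=0):
    """Write a synthetic CSV with n_rows data rows and return n_rows.

    Column names are illustrative placeholders, not an FSML schema.
    """
    rng = random.Random(seed)  # fixed seed keeps test runs reproducible
    header = ["sample_id"] + [f"feature_{i}" for i in range(n_features)] + ["target"]
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        for row_idx in range(n_rows):
            features = [round(rng.gauss(0.0, 1.0), 4) for _ in range(n_features)]
            # Target is a noisy sum of the features, purely for illustration.
            target = round(sum(features) + rng.gauss(0.0, 0.1), 4)
            writer.writerow([f"S{row_idx:06d}", *features, target])
    return n_rows
```

Raising `n_rows` would let the same script produce files of increasing size for testing the export step.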

AndresPerezTesela commented 1 year ago

So far we've tested the CLI manifest generator with a 17,000-row data file. This experiment produced the following findings:

if the output file format is:

So essentially, JSON output is faster to generate but produces a larger file, whereas YAML output is slower but produces a smaller file.
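A comparison like this could be reproduced outside the CLI with a rough sketch such as the following. It is not the CLI's own export code: the `benchmark` helper is hypothetical, `indent=2` is an assumed JSON setting, and the YAML branch only runs if the third-party PyYAML package is installed.

```python
import json
import time

try:
    import yaml  # PyYAML, an optional third-party dependency
except ImportError:
    yaml = None


def benchmark(records):
    """Return {format: (seconds, size_in_bytes)} for serializing records."""
    results = {}

    start = time.perf_counter()
    text = json.dumps(records, indent=2)  # indent=2 is an assumption
    results["json"] = (time.perf_counter() - start, len(text.encode("utf-8")))

    if yaml is not None:
        start = time.perf_counter()
        text = yaml.safe_dump(records)
        results["yaml"] = (time.perf_counter() - start, len(text.encode("utf-8")))

    return results


if __name__ == "__main__":
    # Placeholder records; a real test would load the actual data file.
    rows = [{"sample_id": i, "value": i * 0.5} for i in range(17_000)]
    for fmt, (seconds, size) in benchmark(rows).items():
        print(f"{fmt}: {seconds:.3f} s, {size} bytes")
```

Timings and sizes will depend on the serializer settings and on the shape of the records, so results from this sketch are only indicative.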

Bottom line, to attain unlimited scalability we might need to start exploring this other ticket: