psych-ds / psych-DS

Welcome to Psych-DS! If this is your first time visiting a GitHub repository, look to the left/down to the README (below the repository files). Psych-DS is a specification for behavioral datasets: JSON-LD metadata, predictable directory structure, and machine-readable specifications for tabular datasets in behavioral research.
Creative Commons Attribution 4.0 International

Set expectations about large datasets #41

Open mekline opened 3 months ago

mekline commented 3 months ago

TLDR

At some point, datasets become large enough that they cause high latency or even timeouts on the webapp implementation.

We don't need to provide super duper duper duper performance (very large datasets probably want CLI validation for other reasons, maybe?) but we should give people an indication of what to expect.

Details

Once the low-hanging-fruit speed improvements are done, run some large datasets and add (somewhere on the website, probably the documentation?) some information like "Datasets with files larger than X may take Y amount of time", "Datasets with more than X files may take Z amount of time."
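
A rough sketch of how those numbers could be collected, not a description of the actual validator. It generates synthetic CSV datasets of varying file size and file count, then times a validation callable on each. Everything named here is hypothetical: `make_dataset`, `benchmark`, and the `count_rows` stand-in (which only reads every row) would need to be swapped for a call into the real validator, and the generated folders are placeholders, not valid Psych-DS datasets (no metadata file is written).

```python
import csv
import tempfile
import time
from pathlib import Path
from typing import Callable, Iterable


def write_synthetic_csv(path: Path, n_rows: int, n_cols: int = 5) -> None:
    """Write a CSV with one header row and n_rows of placeholder integers."""
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"col{i}" for i in range(n_cols)])
        for r in range(n_rows):
            writer.writerow([r * n_cols + c for c in range(n_cols)])


def make_dataset(root: Path, n_files: int, rows_per_file: int) -> Path:
    """Create a folder with a data/ subfolder holding n_files synthetic CSVs."""
    dataset = root / f"files{n_files}_rows{rows_per_file}"
    data_dir = dataset / "data"
    data_dir.mkdir(parents=True)
    for i in range(n_files):
        # Placeholder file names; not checked against the Psych-DS naming rules.
        write_synthetic_csv(data_dir / f"study-{i}_data.csv", rows_per_file)
    return dataset


def count_rows(dataset: Path) -> int:
    """Stand-in for a real validator: reads every CSV and counts data rows."""
    total = 0
    for csv_path in sorted(dataset.rglob("*.csv")):
        with csv_path.open(newline="") as f:
            total += sum(1 for _ in csv.reader(f)) - 1  # exclude the header row
    return total


def benchmark(
    validate: Callable[[Path], object],
    shapes: Iterable[tuple[int, int]],
) -> list[dict]:
    """Time validate() on one synthetic dataset per (n_files, rows_per_file) shape."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for n_files, rows_per_file in shapes:
            dataset = make_dataset(Path(tmp), n_files, rows_per_file)
            total_bytes = sum(p.stat().st_size for p in dataset.rglob("*.csv"))
            start = time.perf_counter()
            validate(dataset)
            elapsed = time.perf_counter() - start
            results.append({
                "files": n_files,
                "rows_per_file": rows_per_file,
                "bytes": total_bytes,
                "seconds": elapsed,
            })
    return results


if __name__ == "__main__":
    # One axis varies file size, the other varies file count.
    for row in benchmark(count_rows, [(1, 1_000), (1, 100_000), (100, 1_000)]):
        print(row)
```

Varying rows per file and number of files separately matches the two kinds of statement above ("files larger than X" vs. "more than X files"). Timings from a local run like this would only be indicative for the webapp, since browser performance will differ.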