Feature
Tasks that take input should accept either an in-memory list or a path to a file in a specified format, with optional validation.
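A rough sketch of what accepting either kind of input could look like. Everything here is an assumption made for illustration: the names `TextSource` and `iter_texts`, the `validate` flag, and the one-text-per-line UTF-8 file format are hypothetical and not part of gobbli's current API.

```python
from pathlib import Path
from typing import Iterator, List, Union

# Hypothetical type: either texts already in memory or a path to a file on disk.
TextSource = Union[List[str], Path]


def iter_texts(source: TextSource, validate: bool = False) -> Iterator[str]:
    """Yield texts one at a time from a list or a one-text-per-line file.

    The file format (UTF-8, one text per line) is an assumed example format.
    Validation happens lazily, as each item is consumed.
    """
    if isinstance(source, Path):
        # Stream the file line by line so it never has to fit in memory.
        with source.open("r", encoding="utf-8") as f:
            for line_no, line in enumerate(f, start=1):
                text = line.rstrip("\n")
                if validate and not text.strip():
                    raise ValueError(f"{source}:{line_no}: empty text")
                yield text
    else:
        for index, text in enumerate(source):
            if validate and not isinstance(text, str):
                raise TypeError(f"item {index} is not a string")
            yield text
```

A caller would pass `iter_texts(["a", "b"])` or `iter_texts(Path("texts.txt"), validate=True)` and consume the result the same way in both cases.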
Motivation
Users' datasets would no longer have to fit in memory on gobbli's side (although we can't easily control how some models read their data, so this may make no difference for those models). It may also save time in some cases: we have to write data to disk anyway to make it available to the model Docker containers, so if the data was already on disk, the initial read and write are unnecessary.
Additional Details
Ideally, we'd modify the model wrappers where we control data input (e.g. Transformer, MT-DNN) to optionally load data lazily from files, eliminating the possibility of exhausting RAM.
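As a minimal sketch of lazy loading, a wrapper could read the file in fixed-size batches instead of loading it whole. The function name `iter_batches` and the one-text-per-line format are assumptions for illustration; this is not how the existing wrappers read data.

```python
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def iter_batches(path: Path, batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` texts from a one-text-per-line file.

    Only one batch is held in memory at a time.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    with path.open("r", encoding="utf-8") as f:
        while True:
            # islice pulls the next `batch_size` lines without reading further.
            batch = [line.rstrip("\n") for line in islice(f, batch_size)]
            if not batch:
                return
            yield batch
```

Peak memory then scales with `batch_size` rather than with the size of the dataset.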