Extractors need chunking ability

HealthRex / deployr-dev

Tools for creating cohorts, features, and models

MIT License


Extractors need chunking ability #5

Open conorkcorbin opened 1 year ago

conorkcorbin commented 1 year ago

STARR extractors take cohort tables and join them to specific tables within our STARR data extract on BigQuery to create a timeline of features for each ML example.

When the provided cohort table has too many rows, BigQuery complains.

TODO: implement chunking so that extractors join the cohort table one chunk at a time, keeping each join small enough that BigQuery does not complain. This is needed for all extractors.
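A rough sketch of what the chunking loop could look like, to make the idea concrete. The names here (`iter_chunks`, `extract_in_chunks`, `extract_chunk`) are hypothetical and not part of the existing extractors. `extract_chunk` stands in for whatever per-chunk BigQuery join an extractor would run, and the right chunk size is an assumption that would have to be found by experiment.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_chunks(items: Sequence[T], chunk_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `chunk_size` elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def extract_in_chunks(
    cohort_rows: Sequence[T],
    chunk_size: int,
    extract_chunk: Callable[[Sequence[T]], List[R]],
) -> List[R]:
    """Run `extract_chunk` on one chunk of the cohort at a time and collect the results.

    `extract_chunk` is a hypothetical callback: in a real extractor it would
    join only the given cohort rows against the STARR tables and return the
    resulting feature rows.
    """
    features: List[R] = []
    for chunk in iter_chunks(cohort_rows, chunk_size):
        features.extend(extract_chunk(chunk))
    return features


if __name__ == "__main__":
    # Stand-in for a BigQuery join: tag each cohort id with a fake feature.
    def fake_extract(chunk: Sequence[int]) -> List[tuple]:
        return [(row_id, "feature") for row_id in chunk]

    cohort = list(range(10))
    rows = extract_in_chunks(cohort, chunk_size=4, extract_chunk=fake_extract)
    print(len(rows))  # one feature row per cohort row in this toy example
```

Passing the per-chunk query in as a callback would let every extractor share the same loop, since only the join itself differs between them.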

conorkcorbin commented 1 year ago

@jyx-su did you end up implementing something like this for your project?