Use dask datasets for large datasets

cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine

MIT License


Use dask datasets for large datasets #537

Closed: nhaydel closed this issue 12 months ago

nhaydel commented 1 year ago

The engine needs a mechanism for determining whether it should use dask datasets or pandas datasets. It may also be a good idea to expose this as a config variable so the user can override the choice.

AC:

When working with large datasets (threshold to be determined in this issue), the engine should use dask. Otherwise, it should use pandas.
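To make the AC concrete, here is a minimal sketch of one way the selection could work. It is not the engine's actual implementation. It assumes the decision is based on on-disk file size and that the user override comes from environment variables. The names `DATASET_BACKEND`, `DATASET_SIZE_THRESHOLD`, `choose_backend`, and `read_csv_dataset` are hypothetical, and the 1 GiB default is only a placeholder because the threshold is still to be determined. CSV is used purely for illustration.

```python
import os
from pathlib import Path

# Hypothetical names for illustration; none of these exist in cdisc-rules-engine.
BACKEND_ENV_VAR = "DATASET_BACKEND"            # "pandas", "dask", or "auto"
THRESHOLD_ENV_VAR = "DATASET_SIZE_THRESHOLD"   # threshold in bytes
DEFAULT_THRESHOLD_BYTES = 1024**3              # placeholder; real threshold is TBD


def choose_backend(file_size_bytes, env=None):
    """Return "dask" or "pandas" for a dataset of the given on-disk size."""
    env = os.environ if env is None else env

    # An explicit user setting always wins over the size heuristic.
    override = env.get(BACKEND_ENV_VAR, "auto").strip().lower()
    if override in ("pandas", "dask"):
        return override
    if override != "auto":
        raise ValueError(f"Unsupported {BACKEND_ENV_VAR} value: {override!r}")

    # Otherwise pick dask only when the file is larger than the threshold.
    threshold = int(env.get(THRESHOLD_ENV_VAR, DEFAULT_THRESHOLD_BYTES))
    return "dask" if file_size_bytes > threshold else "pandas"


def read_csv_dataset(path):
    """Load a CSV file with the backend chosen from its file size."""
    size = Path(path).stat().st_size
    if choose_backend(size) == "dask":
        # Imported lazily so a pandas-only install still works for small files.
        import dask.dataframe as dd
        return dd.read_csv(path)
    import pandas as pd
    return pd.read_csv(path)
```

With this shape, setting `DATASET_BACKEND=pandas` would force pandas regardless of size, while leaving it unset (or `auto`) falls back to the size check. File size is only a rough proxy for in-memory size, so the threshold would likely need tuning against real datasets.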