OasisLMF / OasisPlatformLot3

BSD 3-Clause "New" or "Revised" License

Investigate using non pandas dataframes in the existing oasis toolset #1

Open OmegaDroid opened 1 year ago

OmegaDroid commented 1 year ago

We need to ensure the existing toolset works correctly with non-pandas dataframes (e.g. Spark and Dask).

We will test with the existing OasisLMF test suite as well as PiWind to ensure consistent results.

OmegaDroid commented 1 year ago

Originally we tried to swap pandas out for Dask and Spark directly. However, the dataframes behaved so differently that this approach seemed unreasonable.

To get around this we have implemented a wrapper around pandas, which we will be able to extend for Spark, Dask and other backends. The wrapper lets us provide helper methods for problematic operations such as checking for blank values and string operations.
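
To illustrate the idea, here is a minimal sketch of what such a wrapper could look like. It is not the actual implementation: the class names `BaseFrame` and `PandasFrame`, the method names `is_blank` and `upper`, and the `peril` column are all hypothetical, chosen only for this example. The point is that calling code uses the helper methods, so a Spark or Dask backend would only need to supply its own subclass.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseFrame(ABC):
    """Hypothetical backend-neutral interface (names are assumptions)."""

    @abstractmethod
    def is_blank(self, column):
        """Return a boolean column: True where the value is missing or empty."""

    @abstractmethod
    def upper(self, column):
        """Return the column with string values upper-cased."""


class PandasFrame(BaseFrame):
    """pandas-backed implementation of the hypothetical interface."""

    def __init__(self, df: pd.DataFrame):
        self._df = df

    def is_blank(self, column):
        col = self._df[column]
        # Blank means missing (NaN/None) or a string of only whitespace.
        return col.isna() | (col.astype(str).str.strip() == "")

    def upper(self, column):
        # Missing values stay missing; only real strings are changed.
        return self._df[column].str.upper()


frame = PandasFrame(pd.DataFrame({"peril": ["WTC", "  ", None, "wss"]}))
blank_mask = frame.is_blank("peril")
upper_values = frame.upper("peril")
```

A Dask or Spark version would subclass `BaseFrame` in the same way and implement `is_blank` and `upper` with that library's own operations, leaving the calling code unchanged.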