gisaia / ARLAS-proc

Workaround about data ingestion with computing frameworks
Apache License 2.0

Split DataModel into multiple configurations #89

Closed laurent-thiebaud-gisaia closed 5 years ago

laurent-thiebaud-gisaia commented 5 years ago

This split improves the readability of the code and clarifies the responsibilities of the wrappers/transformers.

By the way, I simplified the use of the DataModel in the tests: sometimes it was hard to know which one was used (the ArlasTest one or a test-specific one). As a convention, if a test needs a specific DataModel, it is named "testDataModel"; if it uses data from ArlasTest, then testDataModel is based on the usual dataModel.
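A rough illustration of that naming convention, not the actual test code: the DataModel fields, their defaults, and the ArlasTestFixture trait below are simplified assumptions standing in for the real classes.

```scala
// Hypothetical, simplified stand-in for the project's DataModel.
final case class DataModel(
    idColumn: String = "id",
    timestampColumn: String = "timestamp",
    latColumn: String = "lat",
    lonColumn: String = "lon"
)

// Hypothetical stand-in for the shared fixture exposed by ArlasTest.
trait ArlasTestFixture {
  val dataModel: DataModel = DataModel()
}

// Case 1: the test reuses the ArlasTest data, so testDataModel
// is simply based on the usual dataModel.
object SharedDataTest extends ArlasTestFixture {
  val testDataModel: DataModel = dataModel
}

// Case 2: the test needs its own columns, so it declares a
// test-specific testDataModel and never touches the shared one.
object SpecificDataTest {
  val testDataModel: DataModel =
    DataModel(idColumn = "vessel_id", timestampColumn = "ts")
}
```

Either way, the body of a test only ever refers to testDataModel, so it is clear at a glance which model is in use.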

In WithArlasMovingState and WithArlasTempo, I removed the partitionColumn argument and replaced it with dataModel.idColumn, as these transformers are customer-specific and don't need to change the partitionColumn. If some day they need to, it should rather be part of the Configuration.
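To make the change concrete, here is a minimal sketch using plain Scala collections instead of Spark. MovingStateSketch and its partition method are hypothetical names, and the one-field DataModel is an assumption. The point is only that the partition key is no longer a constructor argument but is read from dataModel.idColumn.

```scala
object PartitionSketch {
  // Hypothetical, reduced DataModel: only the field relevant here.
  final case class DataModel(idColumn: String = "id")

  // Before (sketch): new MovingStateSketch(dataModel, partitionColumn)
  // After: the partition key always comes from the data model.
  class MovingStateSketch(dataModel: DataModel) {
    private val partitionColumn: String = dataModel.idColumn

    // Groups rows by the value found in the id column.
    def partition(rows: Seq[Map[String, String]]): Map[String, Seq[Map[String, String]]] =
      rows.groupBy(row => row(partitionColumn))
  }
}
```

A caller that really needs another partition key would, as suggested above, get it from the configuration rather than from an extra argument.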

I decided to keep hmmWindowSize in the configuration rather than in a static class, because I felt the static design was bad (should we change the static value for unit tests that need different values?). Moreover, I am sure that someday the data scientists will need to change it, and they won't understand if it is static. Feel free to criticize this choice if you prefer the static way, or any other.
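A small sketch of the trade-off, under assumptions: ProcConfiguration, HmmStepSketch, the default of 5000, and the reading of hmmWindowSize as "rows per HMM window" are all hypothetical and only serve to show why a per-instance value is easier to test than a static one.

```scala
object HmmConfigSketch {
  // Static alternative: every caller shares this single value, so a
  // unit test that wants a small window has no clean way to override it.
  object StaticSettings {
    val hmmWindowSize: Int = 5000
  }

  // Configuration alternative: the value travels with the instance.
  final case class ProcConfiguration(hmmWindowSize: Int = 5000)

  class HmmStepSketch(conf: ProcConfiguration) {
    // Number of windows needed to cover nbRows rows (ceiling division).
    def windowCount(nbRows: Int): Int =
      (nbRows + conf.hmmWindowSize - 1) / conf.hmmWindowSize
  }
}
```

With the configuration approach, a test can build `new HmmConfigSketch.HmmStepSketch(HmmConfigSketch.ProcConfiguration(hmmWindowSize = 10))` while production code keeps the default, and neither affects the other.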

laurent-thiebaud-gisaia commented 5 years ago

Rebasing