gisaia / ARLAS-proc

Workaround about data ingestion with computing frameworks
Apache License 2.0

Restructure a dataframe recursively #144

Closed laurent-thiebaud-gisaia closed 4 years ago

laurent-thiebaud-gisaia commented 4 years ago

It can rename columns or group several columns into a structure. It also accepts Column objects, not only column names.
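To illustrate the idea of a recursive restructuring, here is a minimal sketch in plain Python, operating on a single row represented as a dict rather than on a Spark dataframe. It is not the code from this PR: the `restructure` function and the spec format (output name mapped to either a source column name or a nested spec) are assumptions made for the example.

```python
from typing import Any, Dict, Union

# Hypothetical spec format: each output name maps either to a source
# column name (a rename) or to a nested spec (a structure to build).
Spec = Dict[str, Union[str, "Spec"]]


def restructure(row: Dict[str, Any], spec: Spec) -> Dict[str, Any]:
    """Build a new row from `row` by following `spec` recursively."""
    out: Dict[str, Any] = {}
    for name, source in spec.items():
        if isinstance(source, dict):
            # Nested spec: recurse to build a sub-structure.
            out[name] = restructure(row, source)
        else:
            # Plain string: copy the source column under the new name.
            out[name] = row[source]
    return out


row = {"lat": 43.6, "lon": 1.44, "ts": 1577836800}
spec = {"timestamp": "ts", "location": {"lat": "lat", "lon": "lon"}}
result = restructure(row, spec)
```

Here `ts` is renamed to `timestamp`, while `lat` and `lon` are grouped under a `location` structure; deeper nesting only requires a deeper spec.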

laurent-thiebaud-gisaia commented 4 years ago

The goal of this PR is to make it easier to restructure a dataframe before loading it into ES. We are going to have a lot of new structures to create, and with the current groupColumnsInStructure() the code becomes hard to read.

laurent-thiebaud-gisaia commented 4 years ago

@sfalquier I renamed as you suggested.

You also need previously:

On top of that, in both versions, what happens if we name a structure with the name of an existing column of the dataframe, or with the name of an element of the structure itself? These cases should be explicitly handled and tested, since we can no longer avoid this problem with this new version of the code.

I added a test for the first case (naming a structure with the name of an existing column). I don't think we need to check the second case, where a structure is named after one of its own elements (technically these are different objects).
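The two naming cases can be sketched in plain Python on a dict-based row. This is only an illustration, not the behavior implemented in the PR: the `group_columns` helper is hypothetical, and raising an error on a collision is an assumption (the thread does not say whether the real code rejects or overwrites the existing column).

```python
from typing import Any, Dict, List


def group_columns(row: Dict[str, Any], struct_name: str,
                  columns: List[str]) -> Dict[str, Any]:
    """Move `columns` of `row` into a nested dict named `struct_name`."""
    # Columns that are not being grouped stay at the top level.
    remaining = {k: v for k, v in row.items() if k not in columns}
    # Case 1: the structure name clashes with a column that stays
    # at the top level. This sketch chooses to reject it (assumption).
    if struct_name in remaining:
        raise ValueError(f"'{struct_name}' is already a column")
    # Case 2: the structure shares its name with one of its own
    # elements. No clash: the outer key and the inner key live at
    # different levels.
    remaining[struct_name] = {c: row[c] for c in columns}
    return remaining


row = {"id": 1, "lat": 43.6, "lon": 1.44}
nested = group_columns(row, "lat", ["lat", "lon"])
```

With this input, `nested` keeps `id` at the top level and holds a `lat` structure that itself contains a `lat` element, whereas calling `group_columns(row, "id", ["lat", "lon"])` raises a `ValueError`.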

Waiting for your confirmation before merging.