lisad / phaser

The missing layer for complex data batch integration pipelines
MIT License

ReshapePhase needs columns for parsing types #122

Closed by lisad 5 months ago

lisad commented 6 months ago

We didn't put columns in ReshapePhase because they aren't consistent: the columns at the end of a reshape phase can be quite different from the columns at the beginning, e.g. in a pivot.

However, parsing types is necessary! E.g. in the boston pipeline, the pivot phase uses a DateColumn to parse date values.

We should add incoming column types that are checked and cast at the beginning of a ReshapePhase.
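A rough sketch of what that could look like, written as standalone Python rather than against phaser's actual classes. Everything here is an assumption made for illustration: `SketchDateColumn`, `SketchReshapePhase`, the `incoming_columns` attribute, and the `day`/`metric`/`value` field names are all hypothetical. The point is only that the declared columns describe the input shape and are applied once, before the reshape runs.

```python
from datetime import date, datetime


class SketchDateColumn:
    """Hypothetical stand-in for a date-typed column (not phaser's DateColumn)."""

    def __init__(self, name, fmt="%Y-%m-%d"):
        self.name = name
        self.fmt = fmt

    def cast(self, value):
        # Accept values that are already dates; parse strings otherwise.
        if isinstance(value, datetime):
            return value.date()
        if isinstance(value, date):
            return value
        return datetime.strptime(value, self.fmt).date()


class SketchReshapePhase:
    """Hypothetical reshape phase that types its *incoming* columns only."""

    incoming_columns = []  # assumed attribute name

    def reshape(self, rows):
        raise NotImplementedError

    def run(self, rows):
        # Check and cast once, up front; the output shape is left unconstrained.
        typed = [self._cast_row(row) for row in rows]
        return self.reshape(typed)

    def _cast_row(self, row):
        out = dict(row)
        for col in self.incoming_columns:
            if col.name not in out:
                raise KeyError(f"missing incoming column: {col.name}")
            out[col.name] = col.cast(out[col.name])
        return out


class PivotByDate(SketchReshapePhase):
    incoming_columns = [SketchDateColumn("day")]

    def reshape(self, rows):
        # Pivot: one output row per day, one output column per metric.
        pivoted = {}
        for row in rows:
            target = pivoted.setdefault(row["day"], {"day": row["day"]})
            target[row["metric"]] = row["value"]
        return [pivoted[day] for day in sorted(pivoted)]


rows = [
    {"day": "2024-01-02", "metric": "high", "value": "41"},
    {"day": "2024-01-01", "metric": "high", "value": "38"},
    {"day": "2024-01-01", "metric": "low", "value": "25"},
]
result = PivotByDate().run(rows)
```

Because the days are real `date` objects by the time `reshape` runs, the pivot can sort and group on them without doing its own parsing.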

lisad commented 6 months ago

I have a whole new idea instead of ReshapePhase, since it is becoming more and more like Phase. Maybe we make it a Phase, but with a flag that says "renumber rows"?
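To make the flag idea concrete, here is a minimal sketch, again standalone and not phaser's API. The `run_phase` function, the `renumber_rows` parameter name, and the `row_num` key are hypothetical; the sketch just assumes each row carries a row number and that steps are plain functions over a list of rows.

```python
def run_phase(rows, steps, renumber_rows=False):
    """Run steps in order; optionally renumber rows once the shape has changed.

    Assumption: each row is a dict carrying a 'row_num' key (hypothetical).
    """
    for step in steps:
        rows = step(rows)
    if renumber_rows:
        # After a reshape the original numbers no longer map to output rows,
        # so assign fresh ones starting at 1.
        rows = [{**row, "row_num": i} for i, row in enumerate(rows, start=1)]
    return rows


def drop_empty(rows):
    # Example step that changes the number of rows.
    return [row for row in rows if row["value"] != ""]


rows = [
    {"row_num": 1, "value": "a"},
    {"row_num": 2, "value": ""},
    {"row_num": 3, "value": "c"},
]
kept = run_phase(rows, [drop_empty])
renumbered = run_phase(rows, [drop_empty], renumber_rows=True)
```

Without the flag the surviving rows keep their original numbers (1 and 3), which is useful for tracing errors back to the source file; with it they become 1 and 2, which suits a phase whose output rows no longer correspond to input rows.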