Open jensenbox opened 3 months ago
Hi @jensenbox, that looks quite interesting. I think we can fix the performance issues by reworking the lexer and parser to have a lower memory footprint. I have some ideas, but it's a matter of finding the time. Did you try GreenMask? Is it much faster?
@evoxmusic As I understand it, their solution avoids SQL parsing entirely because their data payloads come from the Postgres COPY command. For a transformation it only needs to split the input line into fields, and each value is then ready to be deserialized and transformed.
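To make the "just split the input string" point concrete, here is a minimal sketch of decoding and re-encoding one row in the Postgres COPY text format. It is not GreenMask's code. It assumes the default text format (tab delimiter, `\N` for NULL, backslash escapes) and skips the octal and hex escape forms. The function names are made up for illustration.

```python
# Minimal sketch, NOT GreenMask's implementation.
# Assumes default COPY text format: tab-delimited, NULL written as \N.
NULL_MARKER = r"\N"

# Single-character backslash escapes used by COPY text format.
# Octal (\123) and hex (\x41) escapes are deliberately not handled here.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r",
            "t": "\t", "v": "\v", "\\": "\\"}


def unescape_field(raw):
    """Decode one COPY field; returns None for the NULL marker."""
    if raw == NULL_MARKER:
        return None
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Unknown escapes fall back to the literal character.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def escape_field(value):
    """Encode one value back into COPY text format."""
    if value is None:
        return NULL_MARKER
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def parse_copy_row(line):
    # Literal tabs inside data are escaped as \t, so a plain split is safe.
    return [unescape_field(f) for f in line.rstrip("\n").split("\t")]


def format_copy_row(values):
    return "\t".join(escape_field(v) for v in values)
```

A transformer would then call `parse_copy_row`, change the columns it cares about, and write the row back with `format_copy_row`. No SQL tokenizing is involved.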
Also, they use a 3-step approach
It seems that while the backup file still contains the unsanitized data, their process is significantly faster.
Any chance of adopting their methodology instead of changing the data in flight? Their approach is to mutate the data once it lands in the destination database.
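For illustration, a rough sketch of the "restore first, then mutate in the destination" idea. It uses an in-memory SQLite database as a stand-in for the destination Postgres database. The `users` table, the `email` column, and `mask_email` are all hypothetical and not taken from either tool.

```python
import hashlib
import sqlite3


def mask_email(email):
    # Hypothetical transformer: replace the address with a stable pseudonym.
    digest = hashlib.sha256(email.encode("utf-8")).hexdigest()[:12]
    return f"user_{digest}@example.invalid"


def restore_then_sanitize(rows):
    """Load rows unmodified, then sanitize them in place in the destination."""
    conn = sqlite3.connect(":memory:")  # stand-in for the destination database
    conn.create_function("mask_email", 1, mask_email)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

    # Step 1: restore the raw data as-is (fast, no parsing or rewriting).
    conn.executemany("INSERT INTO users (id, email) VALUES (?, ?)", rows)

    # Step 2: mutate the data once it has landed.
    conn.execute(
        "UPDATE users SET email = mask_email(email) WHERE email IS NOT NULL"
    )
    conn.commit()
    return conn
```

The trade-off is the one noted above: the backup file, and the destination between step 1 and step 2, still hold the unsanitized values.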