Qovery / Replibyte

Seed your development database with real data ⚡️
https://www.replibyte.com
GNU General Public License v3.0

Adopt the same process as https://greenmask.io ? #302

Open jensenbox opened 4 months ago

jensenbox commented 4 months ago

It seems that while the backup file still contains the unsanitized data, their process is significantly faster.

Any chance of adopting their methodology instead of changing the data while it is in flight? Their approach is to mutate the data once it lands in the destination database.
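
To make the "mutate after it lands" idea concrete, here is a minimal sketch of the approach as described in this comment. It is not taken from Greenmask or Replibyte: it uses an in-memory SQLite database as a stand-in for the destination database, and the `users` table, its columns, and the `restore_then_mask` function are hypothetical names chosen for illustration.

```python
import sqlite3
from typing import List, Tuple


def restore_then_mask(rows: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Load raw rows first, then sanitize them in place with one UPDATE."""
    # Assumption: SQLite stands in for the real destination database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

    # Step 1: "restore" the unsanitized data as-is, with no in-flight rewriting.
    conn.executemany("INSERT INTO users (id, email) VALUES (?, ?)", rows)

    # Step 2: mutate the data once it has landed in the destination.
    conn.execute("UPDATE users SET email = 'user' || id || '@example.invalid'")
    conn.commit()

    result = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(restore_then_mask([(1, "alice@corp.example"), (2, "bob@corp.example")]))
```

The trade-off noted above still applies: the raw rows exist in the backup file and briefly in the destination before the UPDATE runs.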

evoxmusic commented 1 month ago

Hi @jensenbox, that looks quite interesting. I think we can fix the performance issues by reworking the lexer and parser to have a lower memory footprint. I've got some ideas, but it's a matter of time. Did you try GreenMask? Is it much faster?

vchervanev commented 2 weeks ago

@evoxmusic As I understand it, their solution avoids SQL parsing entirely because their data payloads come from the Postgres COPY command. For a transformation, it only needs to split the input line, and each value is then ready to be deserialized and transformed.
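
A rough sketch of why COPY payloads are cheap to process, assuming the default text format of PostgreSQL's COPY (tab-separated columns, `\N` for NULL, backslash escapes). This is an illustration, not Greenmask's code: the function names and the `mask_email` transformer are hypothetical, and only the common escapes are handled (octal and hex escapes are left out).

```python
from typing import Callable, Dict, List, Optional

# Common backslash escapes in COPY text format (octal/hex escapes omitted).
_UNESCAPE = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v", "\\": "\\"}


def _unescape(field: str) -> str:
    out, i = [], 0
    while i < len(field):
        if field[i] == "\\" and i + 1 < len(field):
            out.append(_UNESCAPE.get(field[i + 1], field[i + 1]))
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)


def _escape(value: str) -> str:
    # Backslash must be escaped first so later escapes are not doubled.
    return (value.replace("\\", "\\\\").replace("\t", "\\t")
                 .replace("\n", "\\n").replace("\r", "\\r"))


def parse_copy_line(line: str) -> List[Optional[str]]:
    """Split one COPY text row; no SQL lexer or parser is involved."""
    # Literal tabs inside values are escaped, so a plain split is safe.
    fields = line.rstrip("\n").split("\t")
    return [None if f == r"\N" else _unescape(f) for f in fields]


def format_copy_line(values: List[Optional[str]]) -> str:
    return "\t".join(r"\N" if v is None else _escape(v) for v in values)


def mask_email(value: Optional[str]) -> Optional[str]:
    # Hypothetical transformer: hide the local part, keep the domain.
    if value is None:
        return None
    _, _, domain = value.partition("@")
    return "xxx@" + domain


def transform_line(line: str,
                   transformers: Dict[int, Callable[[Optional[str]], Optional[str]]]) -> str:
    values = parse_copy_line(line)
    for index, fn in transformers.items():
        values[index] = fn(values[index])
    return format_copy_line(values)


if __name__ == "__main__":
    row = "1\talice@example.com\t\\N\tline1\\nline2\n"
    print(transform_line(row, {1: mask_email}))
```

Each row is handled independently with a split and a per-column function call, which is the property this comment points to as the reason SQL parsing can be skipped.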

Also they use a 3-step approach