namebrandon / Sparkov_Data_Generation

Synthetic Credit Card Transaction Generator used in the Sparkov program.
MIT License

refactor for speed improvement ~12x for transaction generation + argparse #8

Closed streamnsight closed 2 years ago

streamnsight commented 2 years ago

A major refactor for speed, mostly in transaction generation. Measured run time dropped from ~160s to 13s for 10000 customers with the adults_2550_female_rural profile. Also refactored to use argparse for user input and validation.
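As a rough illustration of argparse-based input validation, here is a minimal sketch. The argument names (`--nb_customers`, `--seed`, `start_date`, `end_date`), the `YYYY-MM-DD` date format, and the helpers `positive_int`, `valid_date` and `build_parser` are assumptions made for this example and are not taken from the PR.

```python
import argparse
from datetime import datetime


def positive_int(text):
    """Convert text to an int and reject values below 1."""
    value = int(text)  # a ValueError here is reported by argparse as a usage error
    if value <= 0:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {text!r}")
    return value


def valid_date(text):
    """Parse a YYYY-MM-DD string (format assumed for this sketch)."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def build_parser():
    # All argument names below are hypothetical.
    parser = argparse.ArgumentParser(description="Synthetic transaction generator (sketch)")
    parser.add_argument("-n", "--nb_customers", type=positive_int, default=10,
                        help="number of customers to generate")
    parser.add_argument("--seed", type=int, default=None,
                        help="random generator seed, for reproducible output")
    parser.add_argument("start_date", type=valid_date, help="first day, YYYY-MM-DD")
    parser.add_argument("end_date", type=valid_date, help="last day, YYYY-MM-DD")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With converters passed as `type=`, argparse rejects bad input with a usage message before any generation code runs.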

things that made a difference for speed:

Tests were modified to fit the new format, but still ensure no regression was introduced. I also compared the final output with fixed random generator seeds, and it was the same for both the original and this version.
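The seed-based comparison can be sketched as follows. This is a toy example, not the PR's code: `pick_slow` and `pick_fast` are hypothetical stand-ins for an original and a refactored sampler, and the categories and weights are made up. The point is only that two implementations consuming the same random draws from the same seed should produce identical output.

```python
import bisect
import itertools
import random

# Hypothetical categories and integer weights, for illustration only.
CATEGORIES = ["grocery", "gas", "travel"]
WEIGHTS = [5, 3, 2]


def pick_slow(rng, n):
    """Stand-in for an original version: rescans the weights on every draw."""
    out = []
    for _ in range(n):
        r = rng.random() * sum(WEIGHTS)
        acc = 0
        for cat, w in zip(CATEGORIES, WEIGHTS):
            acc += w
            if r < acc:
                out.append(cat)
                break
    return out


def pick_fast(rng, n):
    """Stand-in for a refactored version: cumulative weights computed once."""
    cum = list(itertools.accumulate(WEIGHTS))
    total = cum[-1]
    return [CATEGORIES[bisect.bisect_right(cum, rng.random() * total)] for _ in range(n)]


def same_output(seed, n=1000):
    # Each version gets its own generator, seeded identically.
    return pick_slow(random.Random(seed), n) == pick_fast(random.Random(seed), n)
```

A check like `same_output(42)` only holds if both versions draw random numbers in the same order, so it also catches refactors that silently change how the generator is consumed.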

The refactor also avoids overwriting profile object keys, which makes the code very hard to test, and instead creates a separate object property to store the results of the profile weight computations.
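A minimal sketch of that idea, assuming a profile loaded as a dict: the `Profile` class, the `categories_wt` key, and the `category_weights` attribute are hypothetical names chosen for this example, and the normalization is a placeholder for whatever weight computation is actually performed.

```python
class Profile:
    """Hypothetical wrapper around a profile dict."""

    def __init__(self, profile):
        self.profile = profile  # raw input, never modified
        # Computed results live in their own attribute instead of
        # replacing keys inside the raw profile.
        self.category_weights = self._normalize(profile["categories_wt"])

    @staticmethod
    def _normalize(weights):
        """Return a new dict of weights scaled to sum to 1."""
        total = sum(weights.values())
        return {name: w / total for name, w in weights.items()}


raw = {"categories_wt": {"grocery": 5, "gas": 3, "travel": 2}}
p = Profile(raw)
```

Because the input dict is left untouched, a test can build one profile, construct the object more than once, and compare results without results from one run leaking into the next.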