Accelerate runtime when data set is large

almost-matching-exactly / DAME-FLAME-Python-Package

A Python Package providing two algorithms, DAME and FLAME, for fast and interpretable treatment-control matches of categorical data

https://almost-matching-exactly.github.io/DAME-FLAME-Python-Package/

MIT License


Accelerate runtime when data set is large #66

Open wtc100 opened 10 months ago

wtc100 commented 10 months ago

When the data set grows to millions of rows and hundreds of features, a run takes hours. Are there ways to shorten the computing time?

cynrudin commented 10 months ago

Perhaps you might try our other package, https://github.com/almost-matching-exactly/variable_imp_matching, which might scale better. Alternatively, you could try our database option.

On Nov 8, 2023, at 10:22 AM, wtc100 wrote:

When data set size goes to millions of rows and hundreds of features, it takes hours to run. Could there be ways to shorten the computing time?

nehargupta commented 9 months ago

I want to add the link to the database version (the flame_db folder): https://github.com/almost-matching-exactly/DAME-FLAME-Python-Package/tree/2d941bcfa76d7bcd33d58cbf4657202e62cc5b0c

and its documentation: https://github.com/almost-matching-exactly/DAME-FLAME-Python-Package?tab=readme-ov-file#a-tutorial-to-flame-database-version
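For readers wondering why a database-backed approach might help at this scale, here is a minimal sketch of the general idea: let the database group units that agree on a set of categorical covariates, rather than doing the grouping in memory. This is not the flame_db API. The function `exact_match_groups`, the table name `units`, and the column `treated` are hypothetical names chosen for illustration, and SQLite stands in for whatever database you would actually use.

```python
import sqlite3


def exact_match_groups(rows, covariates):
    """Group units that agree on every listed covariate via SQL GROUP BY.

    rows: list of dicts holding integer-coded covariates plus 'treated' (0/1).
    Returns (covariate_values, n_treated, n_control) for each group that
    contains at least one treated and one control unit.
    Hypothetical helper for illustration; not part of dame_flame.
    """
    conn = sqlite3.connect(":memory:")
    # Column names are interpolated directly, so they must be trusted input.
    col_defs = ", ".join(f'"{c}" INTEGER' for c in covariates)
    conn.execute(f"CREATE TABLE units ({col_defs}, treated INTEGER)")

    placeholders = ", ".join("?" for _ in range(len(covariates) + 1))
    conn.executemany(
        f"INSERT INTO units VALUES ({placeholders})",
        [tuple(r[c] for c in covariates) + (r["treated"],) for r in rows],
    )

    group_by = ", ".join(f'"{c}"' for c in covariates)
    query = (
        f"SELECT {group_by}, SUM(treated), SUM(1 - treated) "
        f"FROM units GROUP BY {group_by} "
        # Keep only groups usable for treatment-control comparison.
        "HAVING SUM(treated) > 0 AND SUM(1 - treated) > 0 "
        f"ORDER BY {group_by}"
    )
    result = [(tuple(r[:-2]), r[-2], r[-1]) for r in conn.execute(query)]
    conn.close()
    return result


rows = [
    {"x1": 0, "x2": 1, "treated": 1},
    {"x1": 0, "x2": 1, "treated": 0},
    {"x1": 1, "x2": 1, "treated": 1},
    {"x1": 1, "x2": 0, "treated": 0},
]
# Matching on both covariates, then on x2 alone after dropping x1.
print(exact_match_groups(rows, ["x1", "x2"]))
print(exact_match_groups(rows, ["x2"]))
```

Dropping a covariate and re-running the query loosely mirrors how matching on fewer covariates yields more matched units. Which covariate to drop, and when to stop, is decided by the package's own algorithms and is not modeled here.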