sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
http://dataprep.ai
MIT License
2.03k stars, 204 forks

Limit scope of compute calculations #912

Open tomgallagher opened 2 years ago

tomgallagher commented 2 years ago

This looks like an extremely useful open source library. Congratulations!

I should be able to work this out from the API docs, but the lack of examples means I'm working by trial and error.

Can one limit the scope of the compute function so that it only produces, say, the stats output, in order to reduce the time taken for the calculations?

More generally, do you have any advice on how to improve the performance of dataprep? I'm looking into Dask clusters, but any other tips would be appreciated.

Thanks

jinglinpeng commented 2 years ago

Hi @tomgallagher, you could try the display parameter to control what is shown and computed. E.g., compute(df, display=["Stats"]) computes only the stats. Or, if you want to disable something, you can use the config: e.g., compute(df, cfg={"hist.enable": False}) disables all histograms. For more configurable parameters, please refer to https://github.com/sfu-db/dataprep/blob/develop/docs/source/user_guide/eda/parameter_configurations.ipynb
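
For readers who want to try both options from the reply above, here is a minimal sketch. The helper name limited_compute and its sections/disabled parameters are hypothetical and not part of dataprep. The helper only assembles the display and cfg keyword arguments quoted in the reply and passes them to whatever compute function it is given. The import path in the usage comment is an assumption and may differ between dataprep versions.

```python
from typing import Any, Callable, Dict, List, Optional


def limited_compute(
    compute_fn: Callable[..., Any],
    df: Any,
    sections: Optional[List[str]] = None,
    disabled: Optional[List[str]] = None,
) -> Any:
    """Call compute_fn(df, ...) restricted to the requested sections.

    sections: names passed through as display=[...], e.g. ["Stats"].
    disabled: config prefixes turned off via cfg={"<name>.enable": False},
              e.g. ["hist"] becomes {"hist.enable": False}.
    """
    kwargs: Dict[str, Any] = {}
    if sections:
        kwargs["display"] = list(sections)
    if disabled:
        kwargs["cfg"] = {f"{name}.enable": False for name in disabled}
    return compute_fn(df, **kwargs)


# Usage (needs pandas and dataprep installed; import path is an assumption):
#   import pandas as pd
#   from dataprep.eda.distribution import compute
#   df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
#   stats_only = limited_compute(compute, df, sections=["Stats"])
#   no_hist = limited_compute(compute, df, disabled=["hist"])
```

Passing the compute function in as an argument keeps the helper independent of the exact import location, so it can be checked with a stub before running it against a real DataFrame.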

tomgallagher commented 2 years ago

Thank you for your reply.
