Open tomgallagher opened 2 years ago
Hi @tomgallagher, you could try the display parameter to control what to show/compute. E.g., compute(df, display = ["Stats"]) to compute only stats. Or if you want to disable something you can use the config, e.g., compute(df, cfg = {"hist.enable": False}) to disable all histograms. For more configurable parameters please refer to https://github.com/sfu-db/dataprep/blob/develop/docs/source/user_guide/eda/parameter_configurations.ipynb
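For anyone landing here later, the two options from the reply can be put together as a minimal sketch. The helper names build_compute_kwargs and compute_restricted are hypothetical and not part of dataprep. The import path dataprep.eda.compute is an assumption, so check it against your installed version. Only the display and cfg keywords, and the "hist.enable" key, come from the reply itself.

```python
from typing import Any, Dict, List, Optional


def build_compute_kwargs(
    display: Optional[List[str]] = None,
    disabled: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for compute() (hypothetical helper).

    display  -- sections to compute, e.g. ["Stats"]
    disabled -- component names to switch off, e.g. ["hist"], which
                becomes the config entry {"hist.enable": False}
    """
    kwargs: Dict[str, Any] = {}
    if display:
        kwargs["display"] = list(display)
    if disabled:
        kwargs["cfg"] = {f"{name}.enable": False for name in disabled}
    return kwargs


def compute_restricted(df: Any, **options: Any) -> Any:
    """Run compute() on df with a reduced scope (hypothetical wrapper)."""
    # Imported lazily so the helper above works without dataprep installed.
    # Assumption: compute is importable from dataprep.eda.
    from dataprep.eda import compute

    return compute(df, **build_compute_kwargs(**options))


# Keyword arguments equivalent to the two calls in the reply:
stats_only = build_compute_kwargs(display=["Stats"])
no_hist = build_compute_kwargs(disabled=["hist"])
```

With dataprep installed, compute_restricted(df, display=["Stats"]) should then be equivalent to the first call suggested above. Other component names passed in disabled are only meaningful if the linked parameter guide lists a matching "<name>.enable" key.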
Thank you for your reply.
(Sent by email in reply to Jinglin Peng's comment of Sat, 25 Jun 2022, 14:36.)
This looks like an extremely useful open source library. Congratulations!
I should be able to work this out from the API docs, but the lack of examples means I'm relying on trial and error.
Can one limit the scope of the compute function to only provide, say, the stats output, in order to reduce the time taken for the calculations?
More generally, do you have any advice on how to improve the performance of dataprep? I'm looking into dask clusters, but any other tips would be appreciated.
Thanks