CADWRDeltaModeling / dms_datastore

Data download and management tools for continuous data for Pandas. See documentation https://cadwrdeltamodeling.github.io/dms_datastore/
MIT License

auto_screen is very verbose and takes many hours to run #45

Closed dwr-psandhu closed 4 months ago

dwr-psandhu commented 4 months ago

auto_screen prints so much to the console that it is hard to use. I suggest switching to logging so the amount of information written to the console can be controlled.
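A minimal sketch of what that could look like with Python's standard `logging` module. The logger name `dms_datastore.auto_screen` and the `screen_station` function are hypothetical stand-ins, not the package's actual names; only the `logging` calls themselves are real.

```python
import logging

# Hypothetical logger name; the real package may organize its loggers differently.
logger = logging.getLogger("dms_datastore.auto_screen")


def configure_console_logging(level=logging.WARNING):
    """Attach one console handler and set the verbosity threshold."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep messages from also reaching the root logger
    return logger


def screen_station(station_id):
    """Hypothetical stand-in for one screening step."""
    logger.debug("Intermediate diagnostics for %s", station_id)  # development detail
    logger.info("Screening %s", station_id)                      # routine progress
    return station_id


# With WARNING as the threshold, the debug and info messages are suppressed.
configure_console_logging(logging.WARNING)
screen_station("example_station")
```

Development-time output would go to `logger.debug`, so a normal run stays quiet while a troubleshooting run can call `configure_console_logging(logging.DEBUG)` to see everything.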

It is also very slow, taking up to 18 hours to run. I am not sure whether this is because the algorithms are heavy or because the implementation is inefficient.
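One way to tell a heavy algorithm from an inefficient implementation is to profile a small subset of the work. The sketch below uses the standard library's `cProfile` and `pstats`; `screen_subset` is a hypothetical placeholder for whatever call screens a handful of stations, and `profile_call` is an illustrative helper, not part of dms_datastore.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def screen_subset(n):
    """Hypothetical stand-in for screening a small subset of stations."""
    return sum(i * i for i in range(n))


result, report = profile_call(screen_subset, 10000)
print(report)
```

If most of the cumulative time sits in a few numerical routines, the algorithms themselves are likely the cost; if it is spread over repeated file reads or per-row Python loops, the implementation is the more likely culprit.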

water-e commented 4 months ago

I'm not sure much of that output is really helpful; a lot of it was left over from development. We can discuss the role of logging.


In reply to Nicky Sandhu's message of Wednesday, February 14, 2024, 7:12 AM.


dwr-psandhu commented 4 months ago

Reran with the latest build of the environment. It is still running at 6.5 hours. Did your version run any faster?

water-e commented 4 months ago

I'll launch it now. I got 1:12 for the population and 3.something hours for reformatting.


In reply to Nicky Sandhu's message of Thursday, February 15, 2024, 7:23 AM.


dwr-psandhu commented 4 months ago

After relaunching it yesterday with a rebuilt environment and the latest dependencies, the raw download and formatting steps took 2 hours; the auto screening, however, took 17 hours.

water-e commented 4 months ago

The only thing I want to add about downloading is that I don't think you should conclude that separate launches are faster than multithreading. It is very hard to tell 1 hour from 2 on the downloads. Reformatting should have a stable speed from invocation to invocation. I can break it up a little more, but that should only favor speed on my machine, not on a 4 or 8 processor system, since it is a traditional scalar debugging job. It used to take 90 minutes, and I'm sure it can again.


In reply to Nicky Sandhu's message of Friday, February 16, 2024, 8:05 AM.
