dssg / triage

General Purpose Risk Modeling and Prediction Toolkit for Policy and Social Good Problems
Other
182 stars 61 forks source link

Support (prefer) filepaths for cohort/label queries #883

Closed ecsalomon closed 2 years ago

ecsalomon commented 2 years ago

Maintaining a large project with multiple experiment files can become burdensome when the same cohort or label queries are used in multiple experiments. This is made easier by allowing users to include filenames in place of code in the config. Triage loads the query into the config before hashing components or the experiment to ensure that the query itself is versioned, rather than the filepath.

I haven't tried to update dirty duck but would be happy to add that to this PR.

Wasn't sure where to plop the new utility function because experiments doesn't have its own utils?

ecsalomon commented 2 years ago

do we not lint? i know black was controversial, but not even pep8?

ecsalomon commented 2 years ago

closes #880

ecsalomon commented 2 years ago

actually finished fiddling :)

shaycrk commented 2 years ago

Thanks @ecsalomon! Just took a pass through and made a couple inline comments, but I think it looks pretty good overall.

I don't think we've had a linter as far back as I remember. I'm actually not a huge fan of linters generally, but if everyone else feels strongly, I imagine we could add one so long as it's not too heavy-handed.

shaycrk commented 2 years ago

@ecsalomon -- I just merged #877, so we should be good to incorporate this branch as well when you have a chance to do the updates and resolve the merge conflicts. Thanks!

ecsalomon commented 2 years ago

Latest changes were written and tested in 3.10 in anticipation of #893

shaycrk commented 2 years ago

Thanks @ecsalomon -- generally looks good to me, but added a couple small notes. Also, seems like several .swp files got added to the commit here. Maybe those should be added to the .gitignore?

shaycrk commented 2 years ago

Looks good to me -- I'll go ahead and merge it!