CDCgov / cfa-viral-lineage-model

Apache License 2.0

Automatic path creation #39

Closed thanasibakis closed 2 weeks ago

thanasibakis commented 3 weeks ago

If this is over-engineered, then I'm not particularly attached to it, but let's see what we think.

We have a lot of Path.mkdir calls sprinkled throughout the code, which makes sense, as we create a lot of exports in a nested file hierarchy. However, I've been bitten a couple of times now by forgetting or misconfiguring a mkdir, and I'd like to help us avoid future bugs.

What I've done is wrap the Path class to give us ValidPath, which essentially guarantees the existence of whichever directory you're trying to write to. (This is especially helpful when nesting downwards, since type(ValidPath(".") / "child") == ValidPath.)
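The description above does not show how ValidPath is implemented, so here is a minimal sketch of one way such a wrapper could work. It is not necessarily the code in this PR. It assumes that "guaranteeing the directory" means creating the parent of every ValidPath at construction time, and that the / operator should return a ValidPath as well. The base class type(Path()) and the __truediv__ override are choices made for this sketch so that it behaves the same on older and newer Python versions.

```python
import tempfile
from pathlib import Path


# Hypothetical sketch, not the PR's actual implementation.
# type(Path()) is the concrete class for this OS (PosixPath or WindowsPath),
# which can be subclassed on Python versions before and after 3.12.
class ValidPath(type(Path())):
    def __new__(cls, *args):
        # Assumption: the path names something to be written, so it is the
        # parent directory that has to exist. A plain Path is used here to
        # compute the parent without triggering this hook again.
        Path(*args).parent.mkdir(parents=True, exist_ok=True)
        return super().__new__(cls, *args)

    def __truediv__(self, key):
        # Route joins through the constructor so that the child is also a
        # ValidPath and its parent (this path) gets created.
        return type(self)(self, key)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = ValidPath(tmp) / "exports" / "eval" / "scores.csv"
    target.write_text("ok")  # no explicit mkdir needed beforehand
    print(type(target).__name__, target.parent.is_dir())
```

With this sketch, joining a child onto a path is what creates that path as a directory, so an intermediate directory exists by the time a file inside it is written. One trade-off is that constructing a ValidPath has a filesystem side effect, including for paths that are only ever read.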

For example, from our codebase,

Path(config["data"]["save_file"]["eval"]).parent.mkdir(
    parents=True, exist_ok=True
)

eval_df.write_csv(config["data"]["save_file"]["eval"])

becomes

eval_df.write_csv(ValidPath(config["data"]["save_file"]["eval"]))

and

forecast_dir = Path(config["forecasting"]["save_dir"])
forecast_dir.mkdir(exist_ok=True)

# ...

plot_dir = forecast_dir / ("convergence_" + model_name)
plot_dir.mkdir(exist_ok=True)

plot.save(plot_dir / (par + ".png"), verbose=False)

becomes

forecast_dir = ValidPath(config["forecasting"]["save_dir"])

# ...

plot.save(
    forecast_dir / ("convergence_" + model_name) / (par + ".png"),
    verbose=False
)