IMPROVE_DATA_DIR and frm.create_outdir

rajeeja commented 5 months ago

All three parts create some directories. For running HPO, these have to be outside the current working structure because we'd be using containers. How do you specify that they should use IMPROVE_DATA_DIR?

test_ml_data_dir = "./ml_data/gCSI-gCSI/split_0"
model_dir = "./out_models/gCSI/split_0"
infer_outdir = "./out_infer/gCSI-gCSI/split_0"

Internal code uses:

frm.create_outdir(outdir=params["model_outdir"])

wilke commented 5 months ago

@rajeeja Could you please elaborate? Why does the directory structure depend on whether the code runs inside or outside a container?

rajeeja commented 5 months ago

> @rajeeja Could you please elaborate? Why does the directory structure depend on whether the code runs inside or outside a container?

Sure. In the .txt file, the path is specified as:

test_ml_data_dir = "./ml_data/gCSI-gCSI/split_0"

Also, ./out_models is hard-coded when you call frm.create_outdir(outdir=params["model_outdir"]) from your code. This will not work when you run it from the container. Please run any code that calls this line:

frm.create_outdir(outdir=params["model_outdir"])
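To illustrate the concern, here is a minimal sketch of why a relative output path behaves differently depending on where the process starts. `create_outdir_sketch` is a hypothetical stand-in and not the actual `frm.create_outdir` implementation. It only assumes that the directory is created at the path it is given.

```python
import os
import tempfile
from pathlib import Path


def create_outdir_sketch(outdir):
    """Hypothetical stand-in for frm.create_outdir (not the real implementation)."""
    # A relative path is resolved against the current working directory.
    path = Path(outdir).resolve()
    path.mkdir(parents=True, exist_ok=True)
    return path


def demo_relative_outdir():
    """Create the same relative outdir from two different working directories."""
    start = Path.cwd()
    created = []
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        try:
            for workdir in (a, b):
                os.chdir(workdir)
                created.append(create_outdir_sketch("./out_models/gCSI/split_0"))
        finally:
            os.chdir(start)  # restore the original working directory
        exists = [p.is_dir() for p in created]
    return created, exists
```

The same string ends up as two different directories, one per working directory. Inside a container, the working directory is usually part of the container filesystem, so outputs written this way are not visible outside unless that location is mounted.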

adpartin commented 4 months ago

@rajeeja @wilke

> All three parts create some directories. For running HPO, these have to be outside the current working structure because we'd be using containers. How do you specify that they should use IMPROVE_DATA_DIR?

Yes. Each of the 3 scripts/steps creates a directory to save outputs from the script. The txt file specifies defaults, including the output dir for each step. The IMPROVE_DATA_DIR is currently designed to point to a benchmark data dir (for the CSA case, it is the csa_data). The idea is that benchmark data dir can be passed into a container via IMPROVE_DATA_DIR. As I understand, your question is: how do we get the results from outside the container? Is that right?
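As a rough sketch of the idea that the benchmark data dir is passed in through the environment: the helper below reads IMPROVE_DATA_DIR and falls back to a default. The function name `get_data_root` and the `./csa_data` fallback are assumptions made for illustration, not part of the IMPROVE library.

```python
import os
from pathlib import Path


def get_data_root(env=None, default="./csa_data"):
    """Return the benchmark data dir from IMPROVE_DATA_DIR.

    Hypothetical helper: both the name and the fallback value are assumptions.
    """
    env = os.environ if env is None else env
    return Path(env.get("IMPROVE_DATA_DIR", default))
```

When a container is started, the variable would be set to the path at which the benchmark data is mounted, so the scripts find the data without any change to the txt file.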

> Also, ./out_models is hard-coded when you call frm.create_outdir(outdir=params["model_outdir"]) from your code. This will not work when you run it from the container. Please run any code that calls this line: frm.create_outdir(outdir=params["model_outdir"])

What do you mean by "hard-coded"? The param params["model_outdir"] determines the dir that will be created to save the outputs of the *train_improve.py script. The param can be specified via a command-line arg or in the txt file.
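The precedence described here (txt file supplies the default, command line overrides it) can be sketched as follows. This is not the IMPROVE parameter loader: the function `load_params`, the `[Train]` section name, and the `--model_outdir` flag spelling are all assumptions made for illustration.

```python
import argparse
import configparser


def load_params(config_text, argv, section="Train"):
    """Merge txt-file defaults with command-line overrides (command line wins)."""
    cfg = configparser.ConfigParser()
    cfg.read_string(config_text)
    # Strip the surrounding quotes that the txt file uses around values.
    params = {key: value.strip('"') for key, value in cfg[section].items()}

    parser = argparse.ArgumentParser()
    parser.add_argument("--model_outdir", default=None)  # assumed flag name
    args = parser.parse_args(argv)
    if args.model_outdir is not None:
        params["model_outdir"] = args.model_outdir
    return params


# Hypothetical txt file content; the section name is an assumption.
CONFIG_TEXT = """
[Train]
model_outdir = "./out_models/gCSI/split_0"
"""
```

With no arguments, `load_params(CONFIG_TEXT, [])` keeps the relative default from the file. Passing an absolute path on a mounted volume, for example `["--model_outdir", "/scratch/out_models"]`, replaces it, which is one way to keep outputs outside the container's working directory.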

wilke commented 1 week ago

This is obsolete now.

JDACS4C-IMPROVE / IMPROVE

IMPROVE_DATA_DIR and frm.create_outdir #5