ECP-CANDLE / candle_lib

MIT License
1 stars 9 forks source link

Allow complete paths and not prepend CANDLE_DATA_DIR #14

Open rajeeja opened 1 year ago

rajeeja commented 1 year ago

Old ways of having a new config_file in the working directory is broken due to CDR: setup:


(Benchmarks) ➜  ~/rBenchmarks/Pilot1/Uno (AUCstd1) ✗
$ ls uno_auc_model.txt
uno_auc_model.txt

(Benchmarks) ➜  ~/rBenchmarks/Pilot1/Uno (AUCstd1) ✗
$ echo $CANDLE_DATA_DIR
/Users/mbook/Benchmarks/Data/Pilot1/

Error: Trying to run this:

python uno_baseline_keras2.py --config_file ./uno_auc_model.txt -e 3 --save_weights save/saved.model.weights.h5 --add_noise True


(Benchmarks) ➜  ~/rBenchmarks/Pilot1/Uno (AUCstd1) ✗
$ python uno_baseline_keras2.py --config_file ./uno_auc_model.txt -e 3 --save_weights save/saved.model.weights.h5 --add_noise True
...
  File "/opt/homebrew/anaconda3/envs/Benchmarks/lib/python3.10/site-packages/candle/parsing_utils.py", line 760, in finalize_parameters
    raise Exception(
Exception: ERROR ! Specified configuration file /Users/mbook/Benchmarks/Data/Pilot1/./uno_auc_model.txt not found ... Exiting
jmohdyusof commented 1 year ago

Uno (develop branch, no command line inputs) fails with:

Traceback (most recent call last): File "uno_baseline_keras2.py", line 690, in main() File "uno_baseline_keras2.py", line 686, in main run(params) File "uno_baseline_keras2.py", line 299, in run use_exported_data=datafile, UnboundLocalError: local variable 'datafile' referenced before assignment

rajeeja commented 1 year ago

Uno (develop branch, no command line inputs) fails with:

Traceback (most recent call last): File "uno_baseline_keras2.py", line 690, in main() File "uno_baseline_keras2.py", line 686, in main run(params) File "uno_baseline_keras2.py", line 299, in run use_exported_data=datafile, UnboundLocalError: local variable 'datafile' referenced before assignment

datafile logic is here: https://github.com/ECP-CANDLE/Benchmarks/blob/develop/Pilot1/Uno/uno_baseline_keras2.py#L271 Why is it not assigned until this point?

We can revert it back to use_exported_data=args.use_exported_data,

jmohdyusof commented 1 year ago

Yeah, I know where it is, and I can fix it by just assigning datafile=None prior to the existing logic, but since I didn't write the benchmark I would prefer someone fix it in a meaningful way.

jmohdyusof commented 1 year ago

In the short term, I can push a fix to candle_lib that will treat config_file paths starting with ./ specially. So e.g.

python uno_baseline_keras2.py --config_file ./uno_auc_model.txt

will work (find the local copy), but

python uno_baseline_keras2.py --config_file uno_auc_model.txt

will fail with:

Exception: ERROR ! Specified configuration file /vast/home/jamal/Code/ECP/CANDLE/Benchmarks/Data/uno_auc_model.txt not found ... Exiting

Is this OK for you?

rajeeja commented 1 year ago

In the short term, I can push a fix to candle_lib that will treat config_file paths starting with ./ specially. So e.g.

python uno_baseline_keras2.py --config_file ./uno_auc_model.txt

will work (find the local copy), but

python uno_baseline_keras2.py --config_file uno_auc_model.txt

will fail with:

Exception: ERROR ! Specified configuration file /vast/home/jamal/Code/ECP/CANDLE/Benchmarks/Data/uno_auc_model.txt not found ... Exiting

Is this OK for you?

That'd work. Thanks

jmohdyusof commented 1 year ago

Pushed to develop branch. We should discuss this and other path issues better when we have time.

rajeeja commented 1 year ago

@j-woz this use to work, now fails due to some CDD settings. https://jenkins-gce.cels.anl.gov/job/CANDLE_P1B1_torch_ckpt/173/console

jmohdyusof commented 1 year ago

This looks like an issue with the output dir locations which we implemented for the container compatibility. I think this is something we really need to discuss at a hackathon, since we can make it whatever we want, but there are conflicting needs of Benchmarks and IMPROVE use cases