JDACS4C-IMPROVE / Singularity

Singularity definitions that can be extended to support execution of community models.
MIT License

Test TGSA #52

Closed: wilke closed this issue 1 year ago

wilke commented 1 year ago

Command:

  1. train.sh 1 ./tmp
  2. singularity run --bind $PWD/tests:/candle_data_dir --nv build/TGSA.sif train.sh 4 /candle_data_dir --epochs 1

Status: In progress

Depends on:

Output:

1:

  File "/homes/wilke/miniconda3/envs/TGSA/lib/python3.6/site-packages/torch_geometric/data/data.py", line 7, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/homes/wilke/miniconda3/envs/TGSA/lib/python3.6/site-packages/torch_sparse/__init__.py", line 28, in <module>
    f'Detected that PyTorch and torch_sparse were compiled with '
RuntimeError: Detected that PyTorch and torch_sparse were compiled with different CUDA versions. PyTorch has CUDA version 10.2 and torch_sparse has CUDA version 11.4. Please reinstall the torch_sparse that matches your PyTorch install.
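The error above comes down to a major.minor mismatch between the CUDA build of PyTorch and the CUDA build of torch_sparse. A minimal sketch of that comparison follows, assuming the two versions are available as plain strings such as '10.2' and '11.4'. The helpers `major_minor` and `cuda_versions_match` are hypothetical names for illustration and are not part of PyTorch or torch_sparse. The only library attribute used is `torch.version.cuda`, and the import is guarded in case PyTorch is not installed.

```python
import re


def major_minor(version):
    """Parse the leading 'major.minor' of a CUDA version string, e.g. '10.2'."""
    match = re.match(r"^(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized CUDA version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(torch_cuda, extension_cuda):
    """Hypothetical helper: True when both builds share the same major.minor."""
    return major_minor(torch_cuda) == major_minor(extension_cuda)


if __name__ == "__main__":
    try:
        import torch
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        print("PyTorch CUDA build:", torch.version.cuda)
    except ImportError:
        print("PyTorch is not installed in this environment")

    # The two versions reported in the RuntimeError above.
    print(cuda_versions_match("10.2", "11.4"))  # prints False
```

Running a check like this inside the TGSA conda environment before importing torch_geometric may make it easier to tell which package needs to be reinstalled.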
wilke commented 1 year ago

Command: singularity run --nv --bind $PWD/tmp:/candle_data_dir build/TGSA.sif train.sh 1 /candle_data_dir --epochs 1

Status: Failed

Output:

CMD = python /usr/local/TGSA/candle_train.py --epochs 1
/usr/local/TGSA/train.sh: line 59: ./candle_glue.sh: No such file or directory
using original data placed in /candle_data_dir
/candle_data_dir//Data
/usr/local/TGSA/train.sh: line 77: ./candle_glue.sh: No such file or directory
using original data placed in /candle_data_dir//Data
using CUDA_VISIBLE_DEVICES 1
...

 'weight_path': ''}
Training on Pilot1 dataset
Traceback (most recent call last):
  File "/usr/local/TGSA/candle_train.py", line 205, in <module>
    main()
  File "/usr/local/TGSA/candle_train.py", line 202, in main
    run(gParams)
  File "/usr/local/TGSA/candle_train.py", line 111, in run
    os.makedirs(output_root_dir)
  File "/opt/conda/lib/python3.7/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/conda/lib/python3.7/os.py", line 223, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/usr/local/TGSA/benchmark_dataset_generator/improve_data_dir'
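The traceback shows `os.makedirs` trying to create the output directory under /usr/local/TGSA, which is inside the read-only container image. One possible workaround, sketched below, is to create the output directory under the bound data directory instead. This is only an illustration and not the actual fix that went into candle_train.py: `resolve_output_dir` is a hypothetical helper, the subdirectory name is taken from the path in the error, and the temporary-directory fallback is an assumption.

```python
import os
import tempfile


def resolve_output_dir(data_dir, subdir="improve_data_dir"):
    """Hypothetical helper: return a writable output directory.

    Prefers data_dir/subdir (e.g. under the bound /candle_data_dir) and
    falls back to a fresh temporary directory if that cannot be created.
    """
    candidate = os.path.join(data_dir, subdir)
    try:
        # exist_ok=True keeps repeated runs from failing on an existing directory.
        os.makedirs(candidate, exist_ok=True)
    except OSError:
        # Covers read-only file systems (Errno 30) and similar failures.
        candidate = tempfile.mkdtemp(prefix="tgsa_output_")
    return candidate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        print(resolve_output_dir(data_dir))
```

With the container invoked as above, passing /candle_data_dir as `data_dir` would keep all writes inside the bind mount rather than the image.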
wilke commented 1 year ago

The command now runs, but it does not print IMPROVE_RESULT.

RylieWeaver commented 1 year ago

Fixed; a pull request has been opened against the improve branch.

wilke commented 1 year ago

Starting over for the next release cycle.