Closed by ekarais 4 months ago
Replace environment.yaml with the following:
name: ten
channels:
  - pytorch
  - nvidia
  - anaconda
  - conda-forge
  - defaults
  - pyg
dependencies:
  - coverage=7.4.3
  - gudhi=3.8.0
  - matplotlib=3.7.2
  - networkx=3.1
  - numpy=1.22.4
  - pandas=1.4.2
  - pip=23.2.1
  - conda-forge::pre-commit=3.6.0
  - pyg::pyg
  - pygments
  - pexpect
  - pytest=7.4.4
  - python=3.10
  - pytorch=2.1.0
  - pytorch-cuda=11.8
  - pyg::pytorch-scatter
  - rdkit
  - scipy=1.11.3
  - seaborn
  - tqdm=4.66.1
  - wandb=0.15.12
  - pip:
    - git+https://github.com/pyt-team/TopoNetX@cede811485aefcff1d013dbb94942e8f92ac5d05
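Once the environment is created, a quick way to confirm that the CUDA build of PyTorch was actually picked up on a cluster node might look like the sketch below. The helper name `cuda_report` is hypothetical and not part of the repository; it only relies on `torch.cuda.is_available()` and `torch.version.cuda`, and falls back gracefully if PyTorch is not installed.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing whether PyTorch can see a GPU.

    Hypothetical helper for sanity-checking the conda environment;
    it is not part of the project code.
    """
    # Avoid an ImportError if the environment was not activated.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only builds; a version string for CUDA builds.
        "cuda_build": torch.version.cuda,
        # True only if a CUDA build is installed and a GPU is visible.
        "cuda_available": torch.cuda.is_available(),
    }
```

Running `print(cuda_report())` on a GPU node should show `cuda_available` as `True` if the `nvidia` channel and `pytorch-cuda` pin took effect; on a login node without a GPU it will usually be `False` even when the install is correct.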
Regarding the wandb.init call, replace the original line with:
wandb.init(entity='ten-harvard', project=f"QM9-{args.target_name}")
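For context, here is a minimal sketch of how that call could be wrapped so the entity and project name are built in one place. The function names `wandb_init_kwargs` and `start_run` are hypothetical, and the sketch assumes the training script already has the target name available (as `args.target_name` in the line above). Only the `entity` and `project` arguments of `wandb.init` are used.

```python
def wandb_init_kwargs(target_name, entity="ten-harvard"):
    """Build the keyword arguments for wandb.init.

    Hypothetical helper: `entity` selects the team account the run is
    logged to, and the project name follows the QM9-<target> pattern.
    """
    return {"entity": entity, "project": f"QM9-{target_name}"}


def start_run(target_name):
    """Start a W&B run under the team account (sketch only)."""
    # Imported lazily so the helper above can be used without wandb installed.
    import wandb

    return wandb.init(**wandb_init_kwargs(target_name))
```

In the training script this would be called as `start_run(args.target_name)`. The account used on the cluster must be a member of the `ten-harvard` team for the run to be accepted.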
There haven't been any issues training on the clusters, so this is a non-issue. Closing.
We need the following changes to run training on clusters:
- environment.yaml: add nvidia to channels and add pytorch-cuda=11.8 to dependencies.
- main_qmp9.py: adapt the wandb.init call so that the run gets logged to the team account ten-harvard (how? @gdasoulas)