dgcnz closed this issue 5 months ago
Example usage of downloading and loading from wandb.
```python
import wandb
from src.utils.wandb import get_all_checkpoints, download_artifact
from src.models.wang2022_module import Wang2022LightningModule
from pathlib import Path

# get all checkpoints from run id
entity = "uva-dl2"
project = "wang2022"
run_id = "bpiojh4w"
checkpoints = get_all_checkpoints(run_id, project, entity)
print(checkpoints)

with wandb.init(project=project, entity=entity, job_type="run-evaluation-test") as run:
    artifact_dir = download_artifact(run, checkpoints[0], project, entity)
    model = Wang2022LightningModule.load_from_checkpoint(Path(artifact_dir) / "model.ckpt")
    print(model)
```
done
Description
This issue concerns the extension of Wang 2022 figure 4, which consists of fixing a dataset's equivariance (levels = [full, partial]) and training a relaxed equivariant model with `k` different values of `alpha`. So we have a total of `2 * k` runs per model, where 2 represents the number of datasets and `k` represents the number of different `alpha` values tested.

Configs
SmokePlume configs:
- `configs/data/wang2022/equivariance_test.yaml` (✅) (see Q1)

Model default configs (DO NOT MODIFY THESE FILES, other experiment files rely on the defaults set here):
- `configs/model/wang2022/convnet.yaml` (✅)
- `configs/model/wang2022/rgroup.yaml` (❓)
- `configs/model/wang2022/rsteer.yaml` (✅)

Experiment configs:
- `configs/experiment/wang2022/equivariance_test/convnet.yaml` (✅)
- `configs/experiment/wang2022/equivariance_test/rgroup.yaml` (❓)
- `configs/experiment/wang2022/equivariance_test/rsteer.yaml` (✅)

These configs have to be at least tested with `trainer.fast_dev_run` to ensure that the model processes data correctly at all. This doesn't cover model checkpointing and early stopping, so we'll have to add tests for those. Examples can be found in the Makefile's `test_wang2022_figure_4` target, which you can run with `make test_wang2022_figure_4`.

Example testing command:
Legend:
Tasks
- […] `equivariance_level`, `alpha`, `model`) and runs the corresponding experiment
- […] `equivariance_level`, `alpha` and `model` and launches all SLURM jobs (if Snellius comes back up in time)

Questions
- […] `10` remaining time steps for testing instead of testing on the validation set?

Feel free to add more questions or tasks.
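
For whoever picks up the launch script: a minimal sketch of the run grid from the Description (2 equivariance levels times `k` alphas per model). The `ALPHAS` values and the `enumerate_runs` helper are placeholders I made up for illustration; the issue does not fix the actual alphas, and it may turn out that not every model takes an `alpha` at all.

```python
from itertools import product

# Equivariance levels of the dataset, as listed in the Description.
EQUIVARIANCE_LEVELS = ["full", "partial"]

# Model names, matching the experiment config file names.
MODELS = ["convnet", "rgroup", "rsteer"]

# ASSUMPTION: placeholder alpha values (k = 4); the real ones are not decided here.
ALPHAS = [0.0, 1e-6, 1e-4, 1e-2]


def enumerate_runs(models, levels, alphas):
    """Return one dict per (model, equivariance_level, alpha) combination."""
    return [
        {"model": model, "equivariance_level": level, "alpha": alpha}
        for model, level, alpha in product(models, levels, alphas)
    ]


runs = enumerate_runs(MODELS, EQUIVARIANCE_LEVELS, ALPHAS)
runs_per_model = len(runs) // len(MODELS)

print(f"{runs_per_model} runs per model, {len(runs)} runs in total")
for run in runs[:3]:
    print(run)
```

Each dict could then be turned into one experiment invocation or one SLURM job; how the three values map onto config overrides depends on the experiment configs and is left open here.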