Discrepancy in Results Between W&B Tuning and Local Workflow

fy711 commented 1 month ago

Description

I've encountered an issue while trying to reproduce hyperparameter tuning results from Weights & Biases in my local workflow using the Scyan package.

Steps to Reproduce

Performed hyperparameter tuning using Hydra from the terminal with the following command:

python -m scripts.run -m project=my_project trainer=mps wandb.mode=online wandb.save_umap=true save_predictions=true

Selected a specific set of hyperparameters based on W&B results.

Attempted to replicate these results in my main workflow using the following code:

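import scyan

adata, table = scyan.data.load("my_project")
# model_hyperparameters_from_wb: dict of the hyperparameters selected from the W&B runs
model = scyan.Scyan(adata, table, **model_hyperparameters_from_wb)
model.fit()
model.predict()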

Expected Behavior

The prediction results in my main workflow should match those observed in Weights & Biases.

Actual Behavior

The prediction results in my main workflow differ from what I see in Weights & Biases.

Environment

Scyan version: 1.6.2 (local pip install dev mode)
Python version: 3.10
Operating System: macOS-14.6-arm64

Questions

Is the discrepancy when reproducing W&B results in a local workflow expected, or is it an issue?
Are there any additional steps I should take to ensure consistency between W&B and local runs?
How can I access the prediction results created by Hydra?
Are there any updates on using the MPS trainer? Currently I'm using the fallback option, but it would be fantastic if I could utilize the GPU.

Thank you very much for your help and for developing this wonderful package!

quentinblampey commented 1 month ago

Hi @fy711, are the results very different with or without W&B? I think it may simply be due to the random seed. In the W&B scripts, we change the seed at each run (see here). Can you try setting it to 0 at each step and see if the results are the same?
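To illustrate the seed suggestion, here is a minimal sketch of fixing the random seed before building and fitting the model locally. The helper name `seed_all` is hypothetical and not part of Scyan; it seeds Python's `random` module, and NumPy and PyTorch only if they are installed. PyTorch Lightning also provides `seed_everything`, which serves the same purpose. Whether this alone makes local results match a given W&B run depends on the seed that run actually used.

```python
import importlib.util
import random


def seed_all(seed: int = 0) -> None:
    """Seed the common random number generators (hypothetical helper, not a Scyan API)."""
    random.seed(seed)
    # Seed NumPy only if it is installed
    if importlib.util.find_spec("numpy") is not None:
        import numpy as np

        np.random.seed(seed)
    # Seed PyTorch only if it is installed
    if importlib.util.find_spec("torch") is not None:
        import torch

        torch.manual_seed(seed)


# Call this once before creating and fitting the model
seed_all(0)
```

Calling `seed_all` with the same value before each local fit should at least make the local runs comparable with one another.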

By default, the predictions are not saved when using W&B. You can do so by setting save_predictions to true in the config here, and the results should be stored as a CSV in the hydra log directory. Let me know if that works for you!
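As a rough sketch of how to locate the saved predictions, the snippet below lists every CSV file under a log directory, most recent first. The function name `find_csv_files` is hypothetical, and the directory name `multirun` is an assumption: it is Hydra's default output folder for `-m` runs, but the project config may point elsewhere.

```python
from pathlib import Path


def find_csv_files(log_root):
    """Return all CSV files under log_root, most recently modified first."""
    root = Path(log_root)
    if not root.is_dir():
        return []
    # Search recursively, since Hydra nests outputs by date, time and job number
    return sorted(root.rglob("*.csv"), key=lambda p: p.stat().st_mtime, reverse=True)


# "multirun" is an assumed location; adjust it to the actual Hydra log directory
for path in find_csv_files("multirun"):
    print(path)
```

Any file found this way can then be opened with a CSV reader of your choice.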

Regarding the MPS trainer, this actually depends on PyTorch Lightning, so I can't really tell when it will be fully supported...

fy711 commented 1 month ago

Hi Quentin,

Thank you for the detailed answers! I'll incorporate your suggestions into my workflow and check the results. I appreciate your help and am closing this issue for now. Thanks again!

MICS-Lab / scyan