Closed: sideDesert closed this issue 3 months ago
I have added a few additional logs, which are reflected in the Current Behavior section. I have made no other code changes in the file.
Hi, thanks for the feedback -- I was dealing with the dependency problem and I just pushed an update to the repo. It seems you have addressed the dependency problem in this issue. There is one thing I want to clarify in the reproduction steps: there is no `create_dataset.py` in `./thermostat`. I guess you are talking about `download_data.py`? You should run `python download_data.py -c configs/yelp_polarity/bert/svs-3600.jsonnet -home .` first, before running `run_explainer.py`.
Then, regarding the run that appears stuck: although I cannot run experiments on my borrowed MacBook, I successfully replicated this on a CPU-only machine. I intentionally modified `n_samples` in `thermostat/configs/yelp_polarity/bert/svs-3600.jsonnet` to 1 for debugging purposes, and it took 11 minutes (by comparison, the same config on an A40 GPU takes only 6 seconds, about 110x faster). So with the original `svs-3600.jsonnet` config, it would take 11*25/60 ≈ 4.58 hours for just a single sample on CPU! That is why the program looks "stuck" -- it is not really stuck, it just runs very slowly.
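As a quick sanity check on the estimate above (the 25 is the `n_samples` value assumed in the 11*25/60 calculation, under the assumption that runtime scales linearly with `n_samples`):

```python
# Rough CPU runtime estimate for the full svs-3600 config.
minutes_with_one_sample = 11   # measured on CPU with n_samples = 1
n_samples_original = 25        # value assumed in the 11*25/60 estimate

total_hours = minutes_with_one_sample * n_samples_original / 60
print(f"Estimated CPU time: {total_hours:.2f} hours")  # -> 4.58 hours

# GPU comparison: 11 minutes on CPU vs. 6 seconds on an A40
speedup = (minutes_with_one_sample * 60) / 6
print(f"GPU speedup: {speedup:.0f}x")                  # -> 110x
```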
The thermostat code is not optimized for running on a CPU or on an M2 chip, and running a BERT-scale language model takes a lot of time without a dedicated accelerator such as a GPU: the code here is plain PyTorch/Hugging Face code with no CPU- or M2-specific optimizations. To assist with reproduction, I have uploaded my precomputed svs-3600* output files and will update the repo accordingly. Check out `README.md`.
Also, if you want to replicate my further experiments, such as training the amortized model, I would suggest running on a GPU rather than on an M2 chip, as none of my code is optimized for the M2 chip either. Unfortunately, I do not have the appropriate machine and do not have the bandwidth to implement M2 support.
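As a minimal sketch (not thermostat's actual code), this is how a PyTorch script could pick the best available device and fall back gracefully on a machine without an accelerator; `pick_device` and its `prefer_mps` flag are hypothetical helpers, not part of the repo:

```python
def pick_device(prefer_mps: bool = True) -> str:
    """Return the best available torch device string.

    Falls back to "cpu" when torch is not installed or no
    accelerator is available. Hypothetical helper, not part
    of the thermostat codebase.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # MPS is the Apple-silicon (M1/M2) backend; downstream code may
    # not support it, hence the opt-in flag.
    if (prefer_mps
            and getattr(torch.backends, "mps", None) is not None
            and torch.backends.mps.is_available()):
        return "mps"
    return "cpu"

print(pick_device())  # "cuda", "mps", or "cpu" depending on the machine
```

Even with MPS available, kernels not yet implemented for that backend silently fall back to CPU, so a GPU machine remains the practical choice here.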
Closing this issue, as the problem seems to be resolved after offline discussion. Feel free to reopen it if needed.
I am trying to run `thermostat/run_explainer.py` using the bash command given in the documentation. After downloading the dataset and then running the file, execution seems to get stuck after some time on 0.

Expected Behavior
Shapley values should be calculated by thermostat without any errors.
Current Behavior
Code execution seems to stop at 0 and doesn't move forward.
Possible Solution
Steps to Reproduce
conda create -p ./py38 python=3.8
conda activate ./py38
thermostat/experiments/thermostat/yelp_polarity/bert/svs-3600
thermostat/experiments/datasets/yelp_polarity
python thermostat/create_datasets.py
bash run.sh task=yelp_polarity model=bert explainer=svs-3600 seed=1 batch_size=1 device=0
Context (Environment)
Python version: 3.8.19 (Conda environment)
Machine: MacBook Air M2, macOS Sonoma 14.4
pip version: 24.0
Pip List