stan-dev / cmdstanpy

CmdStanPy is a lightweight interface to Stan for Python users which provides the necessary objects and functions to compile a Stan program and fit the model to data using CmdStan.
BSD 3-Clause "New" or "Revised" License

Error during sampling #650

Open yingyuctw opened 1 year ago

yingyuctw commented 1 year ago

Summary:

I am trying to run CmdStanPy, but it keeps showing an error.

Description:

I am using PyCharm to run the code. I looked up the error code on Stack Overflow. There, in a Docker context, it was described as a C++ issue, so I tried updating the C++ version. That did not work.

Additional Information:

raise RuntimeError(msg)
RuntimeError: Error during sampling:

Command and output files:
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
  ['D:\xxx\Dropbox\Fall2022\dotstudy-master\data\dotstudy\lnrm2_v2.exe', 'id=1', 'random', 'seed=98393', 'data', 'file=C:\\Users\\xxx\\AppData\\Local\\Temp\\tmp36qi7n31\\_mweuysg.json', 'init=C:\\Users\\lalor\\AppData\\Local\\Temp\tmp36qi7n31\l4eb0hu8.json', 'output', 'file=C:\Users\lalor\AppData\Local\Temp\tmp36qi7n31\lnrm2_v2zw2ep36n\\lnrm2_v2-20230130162342_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[3221225781, 3221225781, 3221225781, 3221225781]
 per-chain output files (showing chain 1 only):
 csv_file:
  C:\Users\xxx\AppData\Local\Temp\tmp36qi7n31\lnrm2_v2zw2ep36n\lnrm2_v2-20230130162342_1.csv
 console_msgs (if any):
  C:\Users\xxx\AppData\Local\Temp\tmp36qi7n31\lnrm2_v2zw2ep36n\lnrm2_v2-20230130162342_0-stdout.txt
Consider re-running with show_console=True if the above output is unclear!

Current Version:

1.0.8

WardBrian commented 1 year ago

Can you run the code with `show_console=True` and paste that output?

Are you able to share your Stan code?

yingyuctw commented 1 year ago

11:59:00 - cmdstanpy - INFO - Chain [1] start processing
11:59:00 - cmdstanpy - INFO - Chain [2] start processing
11:59:00 - cmdstanpy - INFO - Chain [3] start processing
11:59:00 - cmdstanpy - INFO - Chain [4] start processing
11:59:00 - cmdstanpy - INFO - Chain [1] done processing
11:59:00 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 3221225653
11:59:00 - cmdstanpy - INFO - Chain [2] done processing
11:59:00 - cmdstanpy - ERROR - Chain [2] error: terminated by signal 3221225653
11:59:00 - cmdstanpy - INFO - Chain [4] done processing
11:59:00 - cmdstanpy - ERROR - Chain [4] error: terminated by signal 3221225653
11:59:00 - cmdstanpy - INFO - Chain [3] done processing
11:59:00 - cmdstanpy - ERROR - Chain [3] error: terminated by signal 3221225653

[Console output interleaved with the traceback: many repeated `Chain [1]`, `Chain [2]`, `Chain [3]` and `Chain [4]` prefixes with no message text after them.]

Traceback (most recent call last):
  File "D:\xxx\Dropbox\Fall2022\dotstudy-master\data\dotstudy\dotstudy_ver4(5).py", line 759, in <module>
    prx_levels = find_salience(dat_prx, 3, 1)
  File "D:\xxx\Dropbox\Fall2022\dotstudy-master\data\dotstudy\adaptive_SFT.py", line 198, in find_salience
    fit_model = sm.sample(data=dat, inits=init_dict,show_console=True)
  File "D:\xxx\Dropbox\Fall2022\venv\lib\site-packages\cmdstanpy\model.py", line 1188, in sample
    raise RuntimeError(msg)
RuntimeError: Error during sampling:

Command and output files:
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
  ['D:\xxx\Dropbox\Fall2022\dotstudy-master\data\dotstudy\lnrm2_v2.exe', 'id=1', 'random', 'seed=86162', 'data', 'file=C:\Users\lalor\AppData\Local\Temp\tmpjq2dphzl\6273ssf8.json', 'init=C:\Users\lalor\AppData\Local\Temp\tmpjq2dphzl\efcagbg7.json', 'output', 'file=C:\Users\lalor\AppData\Local\Temp\tmpjq2dphzl\lnrm2_v2k4xkqkgr\lnrm2_v2-20230206115900_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[3221225781, 3221225781, 3221225781, 3221225781]
 per-chain output files (showing chain 1 only):
 csv_file:
  C:\Users\xxx\AppData\Local\Temp\tmpjq2dphzl\lnrm2_v2k4xkqkgr\lnrm2_v2-20230206115900_1.csv
 console_msgs (if any):
  C:\Users\xxx\AppData\Local\Temp\tmpjq2dphzl\lnrm2_v2k4xkqkgr\lnrm2_v2-20230206115900_0-stdout.txt
Consider re-running with show_console=True if the above output is unclear!

bob-carpenter commented 1 year ago

Hi, @yingyuctw. Can you try running it outside of Dropbox and with the argument `show_console=True`? The latter should actually provide the content of the error messages.

You can try looking in the temp directory specified to see if the output is actually there.

It looks like it's failing at the end while manipulating the files of draws, so it may just be a write-permission failure.

WardBrian commented 1 year ago

3221225653 seems to be used sometimes as the error code for a device I/O timeout, which certainly could be Dropbox- or permissions-related.
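
Codes like these are easier to look up once printed in hexadecimal, since Windows NTSTATUS values are documented that way. The following is a minimal sketch, not part of CmdStanPy: `describe_exit_code` and `NTSTATUS_NAMES` are hypothetical names, and the table holds only two entries taken from Microsoft's NTSTATUS list, so anything else is reported as unknown. It is applied to the two numbers that appear in this thread (the "terminated by signal" value and the `retcodes` value, which differ by 128).

```python
# Illustrative helper (hypothetical, not a CmdStanPy API): show a Windows
# process exit code in hexadecimal and name it if it is in a small table.

# Deliberately incomplete; check any value against Microsoft's NTSTATUS list.
NTSTATUS_NAMES = {
    0xC00000B5: "STATUS_IO_TIMEOUT",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exit_code(code: int) -> str:
    """Return the code as 32-bit hex plus a name when one is known."""
    unsigned = code & 0xFFFFFFFF  # also maps negative codes such as -1
    name = NTSTATUS_NAMES.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


# The value from the log message and the value from retcodes=[...].
for code in (3221225653, 3221225781):
    print(code, "->", describe_exit_code(code))
```

Only the hexadecimal conversion is certain here; which of the two values the executable actually returned, and why they differ, would need to be confirmed against the CmdStanPy source.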

yingyuctw commented 1 year ago

I don't know why it generates lnrm2_v2.exe in Dropbox. I moved the folder to a different location, but the file is still generated in Dropbox. Is there any way to reset it?

Traceback (most recent call last):
  File "D:\lalor\Documents\WORK\dotstudy_ver4(5).py", line 759, in <module>
    prx_levels = find_salience(dat_prx, 3, 1)
  File "D:\lalor\Documents\WORK\adaptive_SFT.py", line 198, in find_salience
    fit_model = sm.sample(data=dat, inits=init_dict,show_console=True)
  File "C:\Program Files\PsychoPy\lib\site-packages\cmdstanpy\model.py", line 1188, in sample
    raise RuntimeError(msg)
RuntimeError: Error during sampling:

Command and output files:
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
  ['D:\lalor\Dropbox\Fall2022\dotstudy-master\data\dotstudy\lnrm2_v2.exe', 'id=1', 'random', 'seed=37954', 'data', 'file=C:\Users\lalor\AppData\Local\Temp\tmp_k5lnenz\njipmno6.json', 'init=C:\Users\lalor\AppData\Local\Temp\tmp_k5lnenz\bnax8cq1.json', 'output', 'file=C:\Users\lalor\AppData\Local\Temp\tmp_k5lnenz\lnrm2_v2z72ov65v\lnrm2_v2-20230206195238_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[-1, -1, -1, -1]
 per-chain output files (showing chain 1 only):
 csv_file:
  C:\Users\lalor\AppData\Local\Temp\tmp_k5lnenz\lnrm2_v2z72ov65v\lnrm2_v2-20230206195238_1.csv
 console_msgs (if any):
  C:\Users\lalor\AppData\Local\Temp\tmp_k5lnenz\lnrm2_v2z72ov65v\lnrm2_v2-20230206195238_0-stdout.txt
Consider re-running with show_console=True if the above output is unclear!

yingyuctw commented 1 year ago

I uploaded the code I use to Gist. I hope it helps.
https://gist.github.com/yingyuctw/fa3e85c3219f85244fee7888b922f07c
https://gist.github.com/yingyuctw/a7e585955c4a1c04d6872d49c01083e2
https://gist.github.com/yingyuctw/143e86ff59982a7fd9978cfbd72b857e

yingyuctw commented 1 year ago

I tried updating the code:

sm = CmdStanModel(stan_file='D:\xxx\Documents\WORK\lnrm2_v2.stan', exe_file='D:\xxx\Documents\WORK')

It still generates an error. I removed the folder from Dropbox, but it still shows the same error.

Command and output files:
RunSet: chains=4, chain_ids=[1, 2, 3, 4], num_processes=4
 cmd (chain 1):
  ['D:\lalor\Dropbox\Fall2022\dotstudy-master\data\dotstudy\lnrm2_v2.exe', 'id=1', 'random', 'seed=28425', 'data', 'file=C:\Users\lalor\AppData\Local\Temp\tmpmm_alrxz\yrqwbvyb.json', 'init=C:\Users\lalor\AppData\Local\Temp\tmpmm_alrxz\0cwc3j2t.json', 'output', 'file=C:\Users\lalor\AppData\Local\Temp\tmpmm_alrxz\lnrm2_v2wkejlf4h\lnrm2_v2-20230211205639_1.csv', 'method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 retcodes=[-1, -1, -1, -1]
 per-chain output files (showing chain 1 only):
 csv_file:
  C:\Users\lalor\AppData\Local\Temp\tmpmm_alrxz\lnrm2_v2wkejlf4h\lnrm2_v2-20230211205639_1.csv
 console_msgs (if any):
  C:\Users\lalor\AppData\Local\Temp\tmpmm_alrxz\lnrm2_v2wkejlf4h\lnrm2_v2-20230211205639_0-stdout.txt

ahartikainen commented 1 year ago

Can you share the code you run? Do you use any caching modules?

yingyuctw commented 1 year ago

@ahartikainen
https://gist.github.com/yingyuctw/fa3e85c3219f85244fee7888b922f07c
https://gist.github.com/yingyuctw/a7e585955c4a1c04d6872d49c01083e2
https://gist.github.com/yingyuctw/143e86ff59982a7fd9978cfbd72b857e

This is the code I am using. dotstudy calls adaptive_SFT, which runs CmdStanPy. The Stan model is lnrm2_v2.stan.

ahartikainen commented 1 year ago

Maybe your code is still using the pickled model? I recommend trying to create a simple script that just compiles the model and then samples.

There is a lot going on right now. For example, by default the .exe is created in the same folder as the .stan file, and I think I saw a Dropbox folder there.

Then another script first checks whether a pickled model is found (in which folder? The same one as the Python file, because that calls chdir?).

Also, what is the purpose of pickling?
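
To illustrate why a pickled model could keep pointing at the old Dropbox location: unpickling restores attribute values exactly as they were when the object was dumped. The sketch below uses a stand-in object and a made-up path (both hypothetical, not taken from the gists), and assumes the pickled object stores its executable path as a plain attribute.

```python
import pickle
from types import SimpleNamespace

# Hypothetical path standing in for wherever the executable was first built.
OLD_EXE = r"D:\old_location\model.exe"

# Stand-in for a model object that remembers the path of its executable.
model_handle = SimpleNamespace(exe_file=OLD_EXE)

# The object is pickled once, while the project still lives in the old folder.
blob = pickle.dumps(model_handle)

# Later the project is moved, but the existing pickle is loaded again.
restored = pickle.loads(blob)

# The restored object still carries the path recorded at dump time.
print(restored.exe_file)
```

If that is what is happening, deleting the pickle file so the model object is rebuilt from the .stan file would be one way to test it.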

yingyuctw commented 1 year ago

> There is a lot going on right now. For example, by default the .exe is created in the same folder as the .stan file, and I think I saw a Dropbox folder there.

@ahartikainen Can you tell me which line? I think I removed everything.

ahartikainen commented 1 year ago

Not sure https://gist.github.com/yingyuctw/fa3e85c3219f85244fee7888b922f07c#file-dotstudy-py-L73

In any case, can you create a minimal example where the data (as JSON), the Stan file, etc. are in the same folder as your CmdStanPy script? Don't use chdir or similar commands.
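
A minimal example along those lines might look like the sketch below. It assumes `lnrm2_v2.stan` and a data file named `data.json` (a hypothetical file name) sit in one folder; `minimal_paths` and `fit_minimal` are illustrative helpers, not CmdStanPy functions. The only CmdStanPy calls used are `CmdStanModel(stan_file=...)`, the `exe_file` property, and `sample(data=..., show_console=True)`.

```python
from pathlib import Path


def minimal_paths(folder):
    """Return absolute paths to the Stan file and JSON data inside `folder`."""
    folder = Path(folder).resolve()
    # "data.json" is an assumed name; use whatever the data file is called.
    return folder / "lnrm2_v2.stan", folder / "data.json"


def fit_minimal(folder):
    """Compile the model next to its .stan file and sample, with no chdir."""
    # Imported here so the path helper above works without CmdStanPy installed.
    from cmdstanpy import CmdStanModel

    stan_file, data_file = minimal_paths(folder)
    model = CmdStanModel(stan_file=str(stan_file))
    # Shows which executable will actually be run.
    print("executable:", model.exe_file)
    return model.sample(data=str(data_file), show_console=True)
```

Calling `fit_minimal(Path(__file__).parent)` from a script saved in that folder keeps every path absolute, so the working directory and any earlier pickle no longer matter.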

yingyuctw commented 1 year ago

@ahartikainen I changed the path you mentioned. It is still the same. I will try to rewrite the code. It was written by other people, and I do not fully understand how it works, including the pickled model. Thanks for your advice.