aniskhan25 / hmsc-hpc


Error when disable verbose: setting `--verbose 0` #15

Open elgabbas opened 2 months ago

elgabbas commented 2 months ago

Hello,

I noticed that passing `--verbose 0` to disable progress messages raises an error. The source of the error was not obvious: I changed many parameters and options before discovering that `--verbose 0` was the cause.

Here is a reprex

library(Hmsc)
library(jsonify)
Sys.setenv(TF_CPP_MIN_LOG_LEVEL = 3)
# python <- "PATH/TO/PYTHON/ENVIRONMENT"

nSamples = 100
thin = 2
nChains = 4
transient = nSamples * thin
verbose = 100  # defined here because sampleMcmc() below reads it

m = Hmsc(
  Y=TD$Y, XData=TD$X, XFormula=~., TrData=TD$Tr[,-1], TrFormula=~., phyloTree=TD$phy,
  studyDesign=TD$studyDesign, ranLevels=list(plot=TD$rL1, sample=TD$rL2))
init_obj = sampleMcmc(
  m, samples=nSamples, thin=thin, transient=transient, nChains=nChains, verbose=verbose, engine="HPC")
init_file_path = file.path(getwd(), "init_file.rds")
saveRDS(to_json(init_obj), file=init_file_path)

post_file_path = file.path(getwd(), "post_file.rds")

This works

verbose = 100
python_cmd_args = paste(
  "-m hmsc.run_gibbs_sampler", "--input", shQuote(init_file_path), "--output", shQuote(post_file_path),
  "--samples", nSamples, "--transient", transient, "--thin", thin, "--verbose", verbose)
system2(python, python_cmd_args)

This fails

verbose = 0
python_cmd_args = paste(
  "-m hmsc.run_gibbs_sampler", "--input", shQuote(init_file_path), "--output", shQuote(post_file_path),
  "--samples", nSamples, "--transient", transient, "--thin", thin, "--verbose", verbose)
system2(python, python_cmd_args)

***\hmsc-venv\lib\site-packages\hmsc\utils\import_utils.py:129: RuntimeWarning: divide by zero encountered in divide
  tmp = distMat / rLPar["alphapw"][:,0,None,None]
***\hmsc-venv\lib\site-packages\hmsc\utils\import_utils.py:129: RuntimeWarning: invalid value encountered in divide
  tmp = distMat / rLPar["alphapw"][:,0,None,None]
sampling
Traceback (most recent call last):
  File "C:\PROGRA~3\ANACON~1\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\PROGRA~3\ANACON~1\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "***\hmsc-venv\lib\site-packages\hmsc\run_gibbs_sampler.py", line 266, in <module>
    run_gibbs_sampler(
  File "***\hmsc-venv\lib\site-packages\hmsc\run_gibbs_sampler.py", line 82, in run_gibbs_sampler
    parSamples = gibbs.sampling_routine(
  File "***\hmsc-venv\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "***\hmsc-venv\lib\site-packages\tensorflow\python\eager\execute.py", line 53, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node while/mod defined at (most recent call last):
  File "C:\PROGRA~3\ANACON~1\lib\runpy.py", line 197, in _run_module_as_main
  File "C:\PROGRA~3\ANACON~1\lib\runpy.py", line 87, in _run_code
  File "***\hmsc-venv\lib\site-packages\hmsc\run_gibbs_sampler.py", line 266, in <module>
  File "***\hmsc-venv\lib\site-packages\hmsc\run_gibbs_sampler.py", line 82, in run_gibbs_sampler
  File "***\hmsc-venv\lib\site-packages\hmsc\gibbs_sampler.py", line 105, in sampling_routine
  File "***\hmsc-venv\lib\site-packages\hmsc\gibbs_sampler.py", line 201, in sampling_routine

Integer division by zero
     [[{{node while/mod}}]] [Op:__inference_sampling_routine_3850]
args=Namespace(samples=100, transient=200, thin=2, chains=None, input='***/init_file.rds', output='***/post_file.rds', verbose=0, hmcleapfrog=10, hmcthin=0, updbe=0, tnlib='tf', fse=1, profile=0, rngseed=0, fp=64)
working directory ***
Initializing TF graph
retracing

Thanks

aniskhan25 commented 2 months ago

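The `Integer division by zero` at node `while/mod` comes from taking a modulo by `verbose` when it is 0. A quick fix is to guard the condition: `if verbose != 0 and (n + 1) % verbose == 0:`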
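
For illustration, here is a minimal plain-Python sketch of that guard. The helper names `should_report` and `reported_iterations` are hypothetical and not part of hmsc-hpc. It assumes `verbose` is an ordinary Python integer, so that `and` short-circuits before the modulo is evaluated. The sampler itself runs this check inside a TensorFlow graph, so the actual change in `gibbs_sampler.py` may need to look different.

```python
def should_report(n: int, verbose: int) -> bool:
    """Decide whether 0-based iteration n should print a progress message.

    Hypothetical helper: verbose == 0 means "never print". The `and`
    short-circuits, so the modulo is never evaluated with a zero divisor.
    """
    return verbose != 0 and (n + 1) % verbose == 0


def reported_iterations(total: int, verbose: int) -> list[int]:
    """List the 1-based iteration numbers at which a message would print."""
    return [n + 1 for n in range(total) if should_report(n, verbose)]


if __name__ == "__main__":
    print(reported_iterations(10, 5))  # [5, 10]
    print(reported_iterations(10, 0))  # [] (no messages, no division by zero)
```

Without the `verbose != 0` check, the second call would raise `ZeroDivisionError` in plain Python, which is the same failure the traceback reports from the graph.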