hmsc-r / hmsc-hpc

Error when fitting spatial model to single species #18

Closed by claraqin 5 days ago

claraqin commented 2 weeks ago

Hello,

First, thanks for providing such a great resource to the community.

I've been encountering an error when I attempt to use Hmsc-HPC to fit a spatial model (i.e. with spatially explicit random effects) to the distribution of a single species: ValueError: 'params['Psi']' has shape (None, None) after one iteration, which does not conform with the shape invariant (None, 1). The full traceback is appended to the bottom of this message.

Below is a script to reproduce this error with simulated data. The script is based on an example in "Joint Species Distribution Modelling: With Applications in R" by Otso Ovaskainen and Nerea Abrego. Specifically, it's from section 5.6.9 ("Spatial Random Effects").

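# Load required packages
library(MASS)     # mvrnorm
library(Hmsc)     # Hmsc, HmscRandomLevel, sampleMcmc
library(jsonify)  # to_json

# Simulate data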
n = 100
beta1 = 0
beta2 = 1
sigma = 1
sigma.spatial = 2
alpha.spatial = 0.5
x = rnorm(n)
L = beta1 + beta2*x
xycoords = matrix(runif(2*n), ncol = 2)
Sigma = sigma.spatial^2* exp(-as.matrix(stats::dist(xycoords))/alpha.spatial)
a = mvrnorm(mu=rep(0,n), Sigma = Sigma)
y=L+a+rnorm(n, sd = sigma)

# Configure Hmsc model
sample.id = as.factor(1:n)
studyDesign = data.frame(sample = sample.id)
rownames(xycoords) = sample.id
rL = HmscRandomLevel(sData = xycoords)
XData = data.frame(x)
Y = as.matrix(y)
m = Hmsc(Y = Y, XData = XData, XFormula = ~ x, studyDesign = studyDesign, ranLevels = list("sample" =rL))

init_file_path_example <- file.path("models", "example_init_spatial.rds")
post_file_path_example <- file.path("models", "example_post_spatial.rds")

# Fit locally
m.local <- sampleMcmc(m, transient=1000, samples=200, thin=5, verbose=100,
                      nChains=2, updater=list(Gamma2=FALSE, GammaEta=FALSE))
# Success!

# Fit on HPC
sampleMcmc(m, transient=1000, samples=200, thin=5, verbose=100,
           nChains=2, engine="HPC", updater=list(Gamma2=FALSE, GammaEta=FALSE)) |>
  to_json() |>
  saveRDS(file = init_file_path_example)

# Copy file to HPC.
# Then on the HPC, load the necessary dependencies and run hmsc-hpc via Terminal:
# python3 -m hmsc.run_gibbs_sampler --input 'models/example_init_spatial.rds' --output 'models/example_post_spatial.rds' --samples 200 --transient 1000 --thin 5 --verbose 100 
# Error!

Here's the full traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/hmsc-hpc/hmsc/run_gibbs_sampler.py", line 266, in <module>
    run_gibbs_sampler(
  File "/home/ubuntu/hmsc-hpc/hmsc/run_gibbs_sampler.py", line 82, in run_gibbs_sampler
    parSamples = gibbs.sampling_routine(
  File "/home/ubuntu/hmsc-venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filea8f_xpk_.py", line 380, in tf__sampling_routine
    ag__.for_stmt(ag__.converted_call(ag__.ld(tf).range, (ag__.ld(step_num),), None, fscope), None, loop_body, get_state_21, set_state_21, ('mcmcSamplesAlphaInd', 'mcmcSamplesBeta', 'mcmcSamplesBetaSel', 'mcmcSamplesDelta', 'mcmcSamplesDeltaRRR', 'mcmcSamplesEta', 'mcmcSamplesGamma', 'mcmcSamplesLambda', 'mcmcSamplesPsi', 'mcmcSamplesPsiRRR', 'mcmcSamplesRhoInd', 'mcmcSamplesSigma', 'mcmcSamplesiV', 'mcmcSampleswRRR', "params['AlphaInd']", "params['Beta']", "params['BetaSel']", "params['Delta']", "params['DeltaRRR']", "params['Eta']", "params['Gamma']", "params['Lambda']", "params['Psi']", "params['PsiRRR']", "params['Xeff']", "params['Z']", "params['iD']", "params['iV']", "params['poisson_omega']", "params['rhoInd']", "params['sigma']", "params['wRRR']", 'hmc_es', 'hmc_las', 'hmc_ss'), {'shape_invariants': [(ag__.ld(params)['Eta'], [ag__.ld(tf).TensorShape([ag__.ld(npVec)[ag__.ld(r)], None]) for r in ag__.ld(range)(ag__.ld(nr))]), (ag__.ld(params)['Lambda'], [ag__.ld(tf).TensorShape([None, ag__.ld(ns)])] * ag__.ld(nr)), (ag__.ld(params)['Psi'], [ag__.ld(tf).TensorShape([None, ag__.ld(ns)])] * ag__.ld(nr)), (ag__.ld(params)['Delta'], [ag__.ld(tf).TensorShape([None, 1])] * ag__.ld(nr)), (ag__.ld(params)['AlphaInd'], [ag__.ld(tf).TensorShape(None)] * ag__.ld(nr))], 'iterate_names': 'n'})
ValueError: in user code:

    File "/home/ubuntu/hmsc-hpc/hmsc/gibbs_sampler.py", line 105, in sampling_routine  *
        for n in tf.range(step_num):

    ValueError: 'params['Psi']' has shape (None, None) after one iteration, which does not conform with the shape invariant (None, 1).

Do you have any insights into what might be happening?

With gratitude, Clara

claraqin commented 2 weeks ago

For what it's worth, I was able to sidestep this issue by adding a "dummy species" made up of random noise as a second species in the model. It seems like TensorFlow simply doesn't take well to a single-species spatial model for some reason.

gtikhonov commented 2 weeks ago

Hello,

Thanks a lot for providing a detailed example to replicate the issue that you have encountered!

I confirm that I have run into the same error as you did. This type of error is known to us, and it is related to the shapes of the model parameters changing due to the adaptive number of random factors, something that TensorFlow is not very happy about by default. However, the source of this particular issue with 1 species is rather mysterious to me, as I cannot clearly pinpoint the operation that might change the shape.
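
For readers unfamiliar with the mechanism behind this kind of error, here is a minimal sketch of how TensorFlow's AutoGraph handles a loop variable whose shape changes between iterations. It is not hmsc-hpc's code: the function `grow_rows`, the stand-in tensor `psi`, and the `(None, 1)` invariant are illustrative assumptions that only mirror the shapes named in the error message.

```python
import tensorflow as tf


@tf.function
def grow_rows(step_num):
    # Stand-in for a parameter with one column (one "species") and a
    # number of rows (latent factors) that may change during sampling.
    psi = tf.zeros([1, 1])
    for _ in tf.range(step_num):
        # Declare which dimensions may vary across iterations: here the
        # row count is left open (None) and the column count is fixed at 1.
        tf.autograph.experimental.set_loop_options(
            shape_invariants=[(psi, tf.TensorShape([None, 1]))]
        )
        # Append one row, so the shape changes on every iteration.
        psi = tf.concat([psi, tf.ones([1, 1])], axis=0)
    return psi


result = grow_rows(tf.constant(3))
print(result.shape)
```

The loop is accepted because every value of `psi` stays compatible with the declared `(None, 1)` invariant. If some operation in the loop body returned a tensor whose second dimension TensorFlow could not infer statically, the inferred shape would become `(None, None)` and the invariant check would fail, which is the kind of mismatch reported in the traceback.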

Anyway, I have pushed a hotfix for this issue (commit cdc45c3) and tested that your example runs smoothly. Please let us know if this also solves the issue on your side.

claraqin commented 1 week ago

Thanks @gtikhonov. Just acknowledging that I received your message, but I haven't had a chance to test the fix yet. I will do so in the next couple of days.

claraqin commented 5 days ago

It works! Thanks again for your help.