rs-station / careless

Merge X-ray diffraction data with Wilson's priors, variational inference, and metadata
MIT License

TruncatedNormal InvalidArgumentError #159

Closed DHekstra closed 4 months ago

DHekstra commented 5 months ago

From @madanmx

I was trying to merge the reprocessed datasets. The command is:

careless mono \
  --mlp-layers=10 \
  --image-layers=2 \
  --dmin=1.8 \
  --iterations=1000 \
  --merge-half-datasets \
  "dHKL,s1x,s1y,ewald_offset" \
  p1p2.mtz \
  merge/p1p2

Upon execution I get the following error:

Training:  41%|████▏     | 413/1000 [04:34<06:30,  1.50it/s, loss=nan, F KLDiv=1.20e+03, NLL=nan]
Traceback (most recent call last):
  File "/home/madan/.conda/envs/careless/bin/careless", line 8, in <module>
    sys.exit(main())
  File "/home/madan/.local/lib/python3.7/site-packages/careless/careless.py", line 9, in main
    run_careless(parser)
  File "/home/madan/.local/lib/python3.7/site-packages/careless/careless.py", line 55, in run_careless
    validation_data=test,
  File "/home/madan/.local/lib/python3.7/site-packages/careless/models/merging/variational.py", line 172, in train_model
    _history = train_step((self, data))
  File "/home/madan/.local/lib/python3.7/site-packages/careless/models/merging/variational.py", line 159, in train_step
    history = model.train_step((data,))
  File "/home/madan/.local/lib/python3.7/site-packages/keras/engine/training.py", line 993, in train_step
    y_pred = self(x, training=True)
  File "/home/madan/.local/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/madan/.local/lib/python3.7/site-packages/careless/models/merging/variational.py", line 121, in call
    z_f = self.surrogate_posterior.sample(self.mc_sample_size)
  File "/home/madan/.local/lib/python3.7/site-packages/careless/models/merging/surrogate_posteriors.py", line 50, in sample
    s = self.distribution.sample(*args, **kwargs)
  File "/home/madan/.local/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py", line 1234, in sample
    return self._call_sample_n(sample_shape, seed, **kwargs)
  File "/home/madan/.local/lib/python3.7/site-packages/tensorflow_probability/python/distributions/distribution.py", line 1212, in _call_sample_n
    n, seed=seed() if callable(seed) else seed, **kwargs)
  File "/home/madan/.local/lib/python3.7/site-packages/tensorflow_probability/python/distributions/truncated_normal.py", line 257, in _sample_n
    seed=samplers.sanitize_seed(seed))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer "variational_merging_model" "                 f"(type VariationalMergingModel).
{{function_node __wrapped__StatelessParameterizedTruncatedNormal_device_/job:localhost/replica:0/task:0/device:CPU:0}} Invalid parameters [Op:StatelessParameterizedTruncatedNormal]
Call arguments received by layer "variational_merging_model" "                 f"(type VariationalMergingModel):
  • inputs=('tf.Tensor(shape=(1173958, 1), dtype=int64)', 'tf.Tensor(shape=(1173958, 1), dtype=int64)', 'tf.Tensor(shape=(1173958, 4), dtype=float32)', 'tf.Tensor(shape=(1173958, 1), dtype=float32)', 'tf.Tensor(shape=(1173958, 1), dtype=float32)')

kmdalton commented 5 months ago

This is a generic error that can occur whenever optimization runs into a numerical problem. The progress bar shows that the problem came from the likelihood term of the objective: NLL is nan while F KLDiv is still finite. This may indicate that the data set contains some unusually large intensities and/or very small uncertainties. The current version of careless should terminate gracefully when this happens rather than raise this cryptic error. I suggest trying the latest version, as it may also include other numerical improvements.
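
One way to look for the kind of observations mentioned above is to screen the intensities and uncertainties before merging. The following is a minimal sketch in plain NumPy, not part of careless: the function name `flag_suspect_reflections` and the thresholds `large_factor` and `tiny_factor` are hypothetical and chosen only for illustration.

```python
import numpy as np


def flag_suspect_reflections(intensity, sigma, large_factor=1000.0, tiny_factor=0.01):
    """Return boolean masks marking observations that might destabilize a likelihood.

    The thresholds are illustrative assumptions, not careless defaults.
    """
    intensity = np.asarray(intensity, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    finite_i = np.isfinite(intensity)
    finite_s = np.isfinite(sigma)

    # NaN or inf in either column
    nonfinite = ~(finite_i & finite_s)
    # Uncertainties must be strictly positive
    nonpositive_sigma = finite_s & (sigma <= 0)
    positive_sigma = finite_s & (sigma > 0)

    # "Typical" magnitudes, taken as medians over the usable values
    typical_i = np.median(np.abs(intensity[finite_i])) if finite_i.any() else np.nan
    typical_s = np.median(sigma[positive_sigma]) if positive_sigma.any() else np.nan

    with np.errstate(invalid="ignore"):
        # Intensities far above the typical magnitude
        large_intensity = finite_i & (np.abs(intensity) > large_factor * typical_i)
        # Positive uncertainties far below the typical uncertainty
        tiny_sigma = positive_sigma & (sigma < tiny_factor * typical_s)

    return {
        "nonfinite": nonfinite,
        "nonpositive_sigma": nonpositive_sigma,
        "large_intensity": large_intensity,
        "tiny_sigma": tiny_sigma,
    }


# Toy data standing in for the intensity and uncertainty columns of an unmerged file
I = np.array([100.0, 120.0, 90.0, 1e9, 110.0, np.nan])
SigI = np.array([10.0, 12.0, 0.0, 11.0, 1e-6, 9.0])

masks = flag_suspect_reflections(I, SigI)
for name, mask in masks.items():
    print(name, np.flatnonzero(mask))
```

For real data, the two arrays would come from the intensity and uncertainty columns of the input MTZ (for example, loaded with `reciprocalspaceship.read_mtz`). The column names depend on how the file was processed, so they are left out here. Any flagged observations are only candidates for inspection, and whether to exclude them is a judgment call.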