lnccbrown / HSSM

Development of HSSM package
Other
82 stars 11 forks source link

Fix nan gradients in analytical likelihood #468

Closed digicosmos86 closed 3 months ago

AlexanderFengler commented 4 months ago

@digicosmos86 is this stale for now?

digicosmos86 commented 4 months ago

@digicosmos86 is this stale for now?

There doesn't seem to be a solution for really small RTs in the denominator, which can blow up

frankmj commented 4 months ago

ah well maybe this is related to my "RT hack" I proposed as an interim solution for cases where t is low. (Which I attributed to the possibility that the sampler would end up proposing values of t that hit the lower bound and lead to unstable gradients, but I know that pymc is supposed to deal with such boundaries smoothly under the hood - so maybe the issue is just the RTs in the denominator being small). In that case the RT hack would still work (ie under the hood befor fitting just add a constant value to all RTs (say 0.5), which should only shift the t parameter, and then report t_new = t - 0.5).

On Tue, Jul 9, 2024 at 8:34 AM Paul Xu @.***> wrote:

@digicosmos86 https://github.com/digicosmos86 is this stale for now?

There doesn't seem to be a solution for really small RTs in the denominator, which can blow up

— Reply to this email directly, view it on GitHub https://github.com/lnccbrown/HSSM/pull/468#issuecomment-2217563269, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG7TFGTTCFH2PBQAF525KLZLPKEDAVCNFSM6AAAAABJUTJUAKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMJXGU3DGMRWHE . You are receiving this because you are subscribed to this thread.Message ID: @.***>

digicosmos86 commented 4 months ago

@frankmj I ran a few more tests and the RT-hack did do the trick. It might be hard for us to implement this trick in our code though, mostly because people use arviz functions instead of the convenience functions that we provide, which could give us some control over the output. We could note this in our documentation somewhere about this trick so that the users can implement this themselves so that they have full control

frankmj commented 4 months ago

Great. maybe one simple solution would be to simply add a link function with t= t'-const ?

On Tue, Jul 9, 2024 at 9:47 AM Paul Xu @.***> wrote:

@frankmj https://github.com/frankmj I ran a few more tests and the RT-hack did do the trick. It might be hard for us to implement this trick in our code though, mostly because people use arviz functions instead of the convenience functions that we provide, which could give us some control over the output. We could note this in our documentation somewhere about this trick so that the users can implement this themselves so that they have full control

— Reply to this email directly, view it on GitHub https://github.com/lnccbrown/HSSM/pull/468#issuecomment-2217795650, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG7TFDEGDMAENAMCYVOMCDZLPSY5AVCNFSM6AAAAABJUTJUAKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMJXG44TKNRVGA . You are receiving this because you were mentioned.Message ID: @.***>

digicosmos86 commented 4 months ago

@frankmj That's a great idea! I also noticed that the RT-hack only worked when float64 is used, indicating some other issues that we might have with numerical stability. I'll look deeper into this

digicosmos86 commented 3 months ago

@cpaniaguam Thanks for the suggestions! I committed all excluding those involving pt.switch. I thought pt.switch is a more readable alternative, but it caused some switch_sink errors that shows up only when float64 is used. Actually removing switch Ops allowed me to sample with float64 with out errors.

Please feel free to take this further. This PR wasn't final - was just a placeholder for some of my experiments

AlexanderFengler commented 3 months ago

@digicosmos86 let's use this PR to switch to float64 overall?

Also, the latest state of affairs with all changes in this PR is that it's still breaking right?

digicosmos86 commented 3 months ago

You are correct. It is still broken. This PR is kind of my mess though. I'd rather start a new one and just switch out all the switch ops, which should get us over the float64 issue

AlexanderFengler commented 3 months ago

@digicosmos86 I am good with that approach.

digicosmos86 commented 3 months ago

Since this is still in the works, I am going to convert it to a draft PR

AlexanderFengler commented 3 months ago

@digicosmos86 to be closed now that the other PR is up?