ssmit1986 / BayesianInference

Wolfram Language application for Bayesian inference and Gaussian process regression
MIT License
37 stars, 4 forks

warning with defineInferenceProblem #9

Closed adhikarirsr closed 3 years ago

adhikarirsr commented 3 years ago

Hi,

Thanks for this package.

I am trying to do this (which is basically the example you showed in the first part of this talk: https://www.youtube.com/watch?v=WzSrbsACdTY):

dat1 = RandomVariate[BinomialDistribution[Length[bindata], 0.7], 1]
obj = defineInferenceProblem[
   "Data" -> dat1,
   "GeneratingDistribution" -> 
    BinomialDistribution[Length[bindata], p],
   "Parameters" -> {p, 0, 1},
   "PriorDistribution" -> BetaDistribution[0.5, 0.5]
  ];

Since BetaDistribution is the conjugate prior for a binomial likelihood, I decided to try it. I get this warning:

checkCompiledFunction::mainEval: CompiledFunction LogLikelihoodFunction has calls to MainEvaluate and may not perform optimally

Why?

After that it works fine:

chain = createMCMCChain[obj, {0.3}];
iterateMCMC[chain, 10000]; (* Burn it in first *)

samples = iterateMCMC[chain, 10000];
Mean[samples]

{0.656597}
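Because the Beta prior is conjugate to the binomial likelihood, the posterior here has a closed form that the MCMC mean can be checked against. The sketch below is a language-agnostic Python illustration (not the package's code), and since the actual `bindata`/sample size from the talk is not shown in this thread, it assumes hypothetical values n = 100 and k = 70:

```python
import math
import random

random.seed(0)

# Hypothetical numbers: the actual bindata from the talk is not shown in
# this thread, so assume n = 100 trials with k = 70 observed successes.
n, k = 100, 70
a, b = 0.5, 0.5  # Beta(1/2, 1/2) prior, as in the issue

# Conjugacy: Beta(a, b) prior + Binomial(n, p) likelihood
# gives a Beta(a + k, b + n - k) posterior with a known mean.
posterior_mean = (a + k) / (a + b + n)

def log_posterior(p):
    if not 0.0 < p < 1.0:
        return -math.inf
    # log of the Beta(a + k, b + n - k) density, up to an additive constant
    return (a + k - 1) * math.log(p) + (b + n - k - 1) * math.log(1 - p)

def metropolis(start, steps, step_size=0.05):
    # Minimal random-walk Metropolis, mirroring the
    # createMCMCChain / iterateMCMC workflow above.
    p, lp = start, log_posterior(start)
    out = []
    for _ in range(steps):
        q = p + random.uniform(-step_size, step_size)
        lq = log_posterior(q)
        if random.random() < math.exp(min(0.0, lq - lp)):
            p, lp = q, lq
        out.append(p)
    return out

burn = metropolis(0.3, 10000)          # burn it in first, then discard
samples = metropolis(burn[-1], 10000)  # retained samples
mcmc_mean = sum(samples) / len(samples)
```

With these assumed numbers the exact posterior mean is (0.5 + 70) / (1 + 100) ≈ 0.698, and the chain's sample mean should land close to it, just as the `Mean[samples]` call above lands close to the true posterior mean for the actual data.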

I have been going through the example notebook. It's pretty good, but are you planning to write documentation? It would be good to have some.

SjoerdSmitWolfram commented 3 years ago

Hi,

Thanks for your interest in my project. I actually didn't realise that talk was floating around on YouTube, so that's an interesting discovery as well ;).

I'll take a more detailed look at your question later, but superficially the answer is that this warning is just that: a warning. The code does its best to generate and auto-compile the "LogLikelihoodFunction" and "LogPriorPDFFunction" properties of the inferenceObject, but sometimes the compilation (i.e., the function Compile) doesn't completely succeed and bits of high-level kernel code remain embedded in the compiled code. In this case it's the binomial coefficient that fails to compile, as you can tell by evaluating:

obj["LogLikelihoodFunction"] // CompilePrint

... 18 R6 = MainEvaluate[ Hold[Binomial][ R1, I4]] ...

It seems like even the factorial cannot be auto-compiled:

Compile[{{n, _Integer}}, n!] // CompilePrint

It's a bit of a surprise, I'll admit. This usually only causes a performance loss, so it's nothing serious. If you want to iron this out for efficiency reasons, you can always write your own loglikelihood function using Compile and feed it into defineInferenceProblem directly via the "LogLikelihoodFunction" property. I'll put this on the to-do list and see if there's a way to automatically compile factorials and the like, because that obviously shouldn't be too difficult.
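The hand-written log-likelihood workaround described here usually amounts to expressing the binomial log-PMF with log-gamma terms instead of factorials, since log-gamma is the form that compilers and numerical runtimes support natively. A hedged Python sketch of that rewrite (an illustration of the identity, not the package's Compile code):

```python
import math

def binom_logpmf(k, n, p):
    # Binomial log-PMF written with lgamma instead of factorials;
    # this form evaluates natively and also avoids overflow for large n.
    log_choose = (math.lgamma(n + 1)
                  - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Agrees with the direct factorial-based computation for small n:
direct = math.log(math.comb(10, 3) * 0.7**3 * 0.3**7)
assert abs(binom_logpmf(3, 10, 0.7) - direct) < 1e-9
```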

As for documentation: unfortunately I just don't have the time for it for the foreseeable future, but you're right. Ultimately it's just a pet project I do in my free time, and I do hope to get around to it again sometime.

SjoerdSmitWolfram commented 3 years ago

Having thought about it a bit more, it seems that Gamma can be compiled, which would be a good alternative to Factorial (since n! == Gamma[n + 1]):

Compile[{{x, _Real}}, Gamma[1 + x]] // CompilePrint

Let me see if there's a way to automatically convert factorials to gammas during compilation.
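The Factorial-to-Gamma rewrite rests on the identity n! = Γ(n + 1), and Gamma has the extra advantage of being a smooth function on the reals rather than an integer-only operation. A quick check in Python, where math.gamma plays the role of WL's Gamma:

```python
import math

# n! == Gamma(n + 1) for non-negative integers...
for n in range(10):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

# ...and Gamma extends smoothly to real arguments, which is what makes it
# a compilable stand-in for the integer-only Factorial: Gamma(1.5) = sqrt(pi)/2
assert math.isclose(math.gamma(1 + 0.5), math.sqrt(math.pi) / 2)
```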

SjoerdSmitWolfram commented 3 years ago

I just drafted an update. The code will now automatically convert factorials and the like to gamma functions. Your example should now compile completely without warning.

adhikarirsr commented 3 years ago

Awesome! Thanks for doing this; it's working now. I am also interested in doing ML with this framework. I will play with it more, and if I get stuck I will let you know. Your example notebook already has lots of good documentation. Maybe just explaining what the code is doing there would be helpful?

For example, in the code with "PriorDistribution" -> {NormalDistribution[0, 100], ExponentialDistribution[1/100]} you have two distributions. It would be good to know what that means and why you are using these two; it would be helpful for a Bayesian novice like me. Also, in "PriorDistribution" -> {"LocationParameter", "ScaleParameter"} you use the two keywords LocationParameter and ScaleParameter. It would be helpful to know the other keywords like that for the other distributions.
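For what it's worth, my reading of that example (an assumption about the package's convention, not official documentation): a list of two priors means one independent prior per model parameter, here a wide normal for a location parameter (any real number) and an exponential for a scale parameter (which must be positive), so the joint log-prior is simply the sum of the individual log-PDFs. A hedged Python sketch:

```python
import math

def log_normal_pdf(x, mu=0.0, sigma=100.0):
    # Wide NormalDistribution[0, 100] prior: weakly informative about a
    # location parameter, which can be any real number.
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def log_exponential_pdf(x, rate=1 / 100):
    # ExponentialDistribution[1/100] prior: supported on x > 0 with mean 100,
    # suitable for a scale parameter, which must be positive.
    return math.log(rate) - rate * x if x > 0 else -math.inf

def log_prior(location, scale):
    # Independent priors, one per parameter: the joint log-prior is a sum.
    return log_normal_pdf(location) + log_exponential_pdf(scale)
```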

Questions: Is there a way to get the distributions of the weights of a neural network using your framework? When you are doing inference using dropout, you are sampling the network ... so does this mean that we can get the distributions of the weights? Thanks again.

adhikarirsr commented 3 years ago

New issues:

It looks like MMA has deprecated a bunch of NN functions:

ConstantArrayLayer::deprec: ConstantArrayLayer is deprecated and will not be supported in future versions of the Wolfram Language. Use NetArrayLayer instead.

NetInformation::obsalt: NetInformation is obsolete. Instead, use Information.

I do not know how to write a package or which file to edit to fix these issues. I would love to learn and help you maintain this package.

SjoerdSmitWolfram commented 3 years ago

Ah, thanks for letting me know. I'll fix it soon, but don't worry: that message is nothing serious. Everything should still work. The NN framework is in a state of constant flux, but the old symbols are still supported for now.

As for editing Mathematica packages: it's not too difficult. You can open the .wl files with Mathematica or a plain text editor like Notepad++. You can make your own edits to the code on your own git branch and test them by loading the package with the PacletDirectoryLoad line at the start of the example notebook. In general, I can highly recommend learning about WL packages and how to write them: it makes maintaining code so much easier than in notebooks. See also: https://www.wolfram.com/workbench/

adhikarirsr commented 3 years ago

Also, is there a way to get the distributions of the weights of a neural network using your framework? When you are doing inference using dropout, you are sampling the network ... so does this mean that we can get the distributions of the weights?

I will take your recommendation seriously and learn how to write packages.

Thanks a lot

SjoerdSmitWolfram commented 3 years ago

There might be a way to obtain a (pseudo)distribution over the network weights, but I don't know how. I just implemented the ideas from ML researchers like Yarin Gal. It's a good question; you could ask him, I suppose.
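For context on the dropout question: in Gal's "MC dropout" view, dropout is kept active at inference time and the network is run many times, each forward pass corresponding to one sampled weight configuration (a weight vector with random entries zeroed). What you get directly is a distribution over outputs rather than over the weights themselves. A toy Python sketch of the idea, with made-up weights and inputs:

```python
import random

random.seed(1)

# Toy single-layer "network" with fixed, made-up weights; dropout stays ON
# at inference time, so each forward pass samples a different weight
# configuration (random units zeroed, survivors rescaled by 1/keep_prob).
weights = [0.5, -0.2, 0.8, 0.1]
x = [1.0, 2.0, -1.0, 0.5]
keep_prob = 0.8

def forward_with_dropout(x, weights, keep_prob):
    total = 0.0
    for xi, wi in zip(x, weights):
        if random.random() < keep_prob:  # unit survives dropout
            total += xi * (wi / keep_prob)
    return total

# Many stochastic passes give samples from an approximate predictive
# distribution (MC dropout): a predictive mean plus an uncertainty estimate.
preds = [forward_with_dropout(x, weights, keep_prob) for _ in range(5000)]
pred_mean = sum(preds) / len(preds)
pred_var = sum((p - pred_mean) ** 2 for p in preds) / len(preds)
```

The predictive mean approaches the deterministic output (here, the dot product of x and weights), while the spread of the samples quantifies the dropout-induced uncertainty.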

adhikarirsr commented 3 years ago

I'm trying to go through his thesis; let's see if he has anything there. Naively, we have to figure out a way to deal with the zero weights caused by dropout, which might be tricky. Maybe we can just ignore those zeros and then use something like LearnDistribution to get the pseudo-distribution. Anyway, thanks again!