JuliaMath / MeasureTheory.jl

"Distributions" that might not add to one.
MIT License

Representing and using point processes #71

gdalle closed this issue 3 years ago

gdalle commented 3 years ago

Hi @cscherrer @mschauer, following up on our Slack discussion, here's an issue about the implementation of point process models and algorithms. Here are our main questions so far:

  1. What is a point process realization? A tuple (locations, marks) or a measure?
  2. How do we represent a point process?
  3. Where would I put this code?
  4. Could it interface well with Distributions.jl in spite of the more general sample type?

cscherrer commented 3 years ago

Thanks @gdalle ! I don't have many definitive answers, but here are some thoughts...

We really need at least two different representations:

  1. An infinite sample from a point process, from which we can compute on a finite subset with no artificially-imposed bound
  2. A set of user-supplied observations, on which we can evaluate the log-density (assuming there is one that's computable)

For (1), I'm not sure about the concrete representation. I do think in general we need to have good support for infinite sequences (see https://github.com/cscherrer/MeasureTheory.jl/issues/65), and to be able to reproduce the same sampled values in different traversals. This means a given instance will always need to store the current RNG state, as well as an initial seed.
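To make the reproducibility requirement concrete, here is a minimal sketch (in Python for brevity, not MeasureTheory.jl code) of a lazily extended sample of a homogeneous Poisson process on the half-line. The class name `LazyPoissonSample` and its methods are hypothetical. It stores the initial seed and the live RNG state, caches the points generated so far, and always extends from the same stream, so different traversals see the same values.

```python
import random


class LazyPoissonSample:
    """Hypothetical lazy sample of a rate-`rate` Poisson process on [0, inf)."""

    def __init__(self, seed, rate=1.0):
        self.seed = seed                  # initial seed, kept for reproducibility
        self.rate = rate
        self._rng = random.Random(seed)   # current RNG state lives here
        self._points = []                 # points generated so far, increasing

    def _extend_past(self, t):
        # Generate points in order until the last cached point exceeds t.
        last = self._points[-1] if self._points else 0.0
        while last <= t:
            last += self._rng.expovariate(self.rate)
            self._points.append(last)

    def points_in(self, a, b):
        """Return the sampled points in the half-open interval [a, b)."""
        self._extend_past(b)
        return [p for p in self._points if a <= p < b]
```

Two instances built from the same seed return the same points on `[0, 5)` and `[5, 10)` regardless of which interval is queried first, because generation always proceeds from the origin.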

For (2), I'd suggest starting with some methods on AbstractArray observations. We can later add alternatives, like maybe a Dict in cases where there are many repeated observations of the same point.
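As an illustration of evaluating a log-density on an array of observations, here is a minimal Python sketch using the standard log-likelihood of an inhomogeneous Poisson process on `[0, horizon]`: the sum of log-intensities at the observed points minus the integrated intensity. The function name and its arguments are hypothetical, and the choice of base measure (which only shifts the result by a constant) is left open here, as it is in this thread.

```python
import math


def poisson_process_loglik(times, intensity, integrated_intensity, horizon):
    """Log-likelihood of observed event times on [0, horizon].

    `intensity(t)` is the intensity at time t, and `integrated_intensity(T)`
    is its integral over [0, T]; both are supplied by the caller (assumption).
    """
    if any(t < 0 or t > horizon for t in times):
        return -math.inf  # observations outside the window are impossible
    log_terms = sum(math.log(intensity(t)) for t in times)
    return log_terms - integrated_intensity(horizon)
```

For a homogeneous process with rate 2 observed on `[0, 3]`, one would pass `lambda t: 2.0` and `lambda T: 2.0 * T`.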

The code can go in MeasureTheory, maybe in a new folder if it's multiple files. If it happens to end up with lots of heavy dependencies that aren't used elsewhere, we should consider a separate repo. But let's not worry about that for now.

I think it should be easy to make it usable with most of Distributions. This is mostly a matter of adding some methods.

cscherrer commented 3 years ago

@gdalle I've invited you as a collaborator, so you should be able to start a new branch. Please start branch names with something identifying you, like maybe gd- or gdalle-

mschauer commented 3 years ago

I mostly work with temporal and spatio-temporal problems, you see. We have not fully worked out how to represent stochastic processes (temporal, spatial...) in MeasureTheory; you often want to think about them differently than as just a distribution object. For Mitosis we have added Markov kernels https://github.com/cscherrer/MeasureTheory.jl/blob/master/src/kernel.jl which generalise distributions and represent the transition distribution rather than the joint law. I found them quite useful, and the code is just 20 lines or so!

I have a number of instances where I have worked with point processes and where I would think that having common infrastructure in MeasureTheory to express those concepts would help interoperability.

Let me list them here:

Sampling inhomogeneous Poisson processes: https://github.com/mschauer/Bridge.jl/blob/master/src/poisson.jl and inference: https://github.com/mschauer/PointProcessInference.jl

Compound Poisson processes: https://github.com/mschauer/Bridge.jl/blob/master/src/levy.jl and also Gamma processes, which are likewise based on a jump measure

and then the plan to add Poisson observation edges to https://github.com/mschauer/Mitosis.jl which would then immediately give something like an extended Kalman filter with Poisson observations.
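For readers unfamiliar with the first item, one standard way to sample an inhomogeneous Poisson process is thinning: propose points from a homogeneous process whose rate dominates the intensity, then keep each proposal with probability `intensity(t) / upper_bound`. The Python sketch below illustrates that technique under the assumption that `upper_bound` really bounds the intensity on the window. It is not taken from the linked Bridge.jl code, and the function name is hypothetical.

```python
import math
import random


def sample_by_thinning(intensity, upper_bound, horizon, rng):
    """Sample event times on [0, horizon] by thinning.

    Assumes 0 <= intensity(t) <= upper_bound for all t in [0, horizon].
    """
    points = []
    t = 0.0
    while True:
        # Next proposal from a homogeneous process with rate `upper_bound`.
        t += rng.expovariate(upper_bound)
        if t > horizon:
            return points
        # Accept the proposal with probability intensity(t) / upper_bound.
        if rng.random() * upper_bound < intensity(t):
            points.append(t)
```

A call such as `sample_by_thinning(lambda t: 1 + math.sin(t), 2.0, 10.0, random.Random(0))` returns an increasing list of event times in `[0, 10]`.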

gdalle commented 3 years ago

Thanks for all of these pointers! I just need to have a quick chat with my advisors regarding open source contributions, and I'll get back to you shortly with proposals :)

mschauer commented 3 years ago

:-) Tell them it is good for your academic networking.

gdalle commented 3 years ago

I set up a meeting with my advisors for next Thursday, so I should have news by the end of the week.

In the meantime, @cscherrer I understand what you mean by representing an infinite sample lazily, but I fear it might be harder than with a mere time-indexed random process, since if you have a point process on R^2 for instance there is no natural order of enumeration. So we need a method to ensure we get the same thing if we query, say, the point locations in one hyperrectangle and then another, without it depending on the order of queries.
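One possible way to get order-independent queries, sketched here in Python for a homogeneous Poisson process on the plane only, is to partition space into unit grid cells and derive each cell's RNG deterministically from a global seed and the cell index. A query then visits only the cells it overlaps, and the content of a cell never depends on what was queried before. All names below are hypothetical, and this is an illustration of the idea rather than a proposal for the final design.

```python
import math
import random


def _poisson_count(rng, mean):
    # Number of unit-rate exponential arrivals in [0, mean) is Poisson(mean).
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def cell_points(seed, i, j, rate):
    """Points of the process in the unit cell [i, i+1) x [j, j+1)."""
    rng = random.Random(f"{seed}:{i}:{j}")  # per-cell RNG, independent of query order
    n = _poisson_count(rng, rate)           # cell area is 1, so the mean is `rate`
    return [(i + rng.random(), j + rng.random()) for _ in range(n)]


def points_in_rectangle(seed, rate, xmin, xmax, ymin, ymax):
    """Points in the half-open rectangle [xmin, xmax) x [ymin, ymax)."""
    found = []
    for i in range(math.floor(xmin), math.floor(xmax) + 1):
        for j in range(math.floor(ymin), math.floor(ymax) + 1):
            for (x, y) in cell_points(seed, i, j, rate):
                if xmin <= x < xmax and ymin <= y < ymax:
                    found.append((x, y))
    return found
```

With this construction, the points found in two adjacent rectangles together match the points found in their union, whichever is queried first. Regenerating a cell on every query trades speed for statelessness; a cache keyed by cell index would avoid the repeated work.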

Btw, I'm not even sure what kind of regions we should allow in queries for spatial point processes. Any thoughts?

cscherrer commented 3 years ago

Ok, yes that makes sense. Let's take a step back. For the point processes you're interested in, ...

  1. What kinds of computations are possible?
  2. What parameterizations are interesting/useful?
  3. I assume a log-density would be defined with respect to counting measure on whatever support you're using. Is that right?

gdalle commented 3 years ago

Hello there,

I had a long talk yesterday with my advisors, and I convinced them that it would be interesting to develop a package for point process simulation and inference in Julia. In terms of academic benefits for me, they agreed it would be better to create my own package, something like PointProcesses.jl. However, I think it would be interesting, both theoretically and practically, to base it on the MeasureTheory.jl formalism, since a point process is basically a random measure.

For the purposes of my research, I am mainly interested in temporal point processes, represented using their conditional intensity function. However, if I am to develop such a package, it would make a lot of sense to include common spatial or spatio-temporal processes. My idea would be to design a unified modeling framework, and then implement the most usual processes (Poisson, Hawkes, etc.), along with (possibly custom) inference / learning algorithms.
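As a concrete example of a conditional intensity function, here is a minimal Python sketch for a Hawkes process with an exponential kernel, where the intensity at time `t` is a baseline `mu` plus a decaying contribution `alpha * exp(-beta * (t - s))` from every past event `s < t`. The exponential kernel and the parameter names are assumptions made for illustration; the thread does not fix a particular parameterization.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity at time t given past event times in `history`.

    Assumed form: mu + sum over s < t of alpha * exp(-beta * (t - s)).
    """
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)
    return mu + excitation
```

With an empty history the intensity reduces to the baseline `mu`, which is the homogeneous Poisson case.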

As for the measure-theoretical formalism, I would really need to dive back into my point process books, but I think we could make it work in an elegant fashion, as long as something like a lazy infinite sum of Diracs is available in MeasureTheory.jl. In any case, this is not my most pressing project, more like a long-term objective for this spring and summer.

What do you guys think?

cscherrer commented 3 years ago

Sounds good! It will be exciting to see this develop.

gdalle commented 3 years ago

PointProcesses.jl is now up and running, and it uses MeasureTheory.jl syntax, so I'll close this for now.