bredelings / MCON


Specifying meta-data about the sample, such as generation number, chain number, particle number, etc. #1

Open bredelings opened 12 months ago

bredelings commented 12 months ago

Q1. How should we flag keys like generation number and related indices in the header field?

For MCMC, the "iteration" or "generation" field is different from other fields. It is always increasing, and it is meta-data about the sample rather than part of the sample itself. Currently we can arrange to place that field first when converting to TSV, but it is not otherwise identified in the header.
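As a rough illustration of the TSV conversion mentioned above, here is a minimal Python sketch that flattens each JSON sample and writes the meta-data column first. The function names `flatten` and `to_tsv`, the `meta_fields` parameter, and the `pi[A]` / `x[1]` column-naming scheme are assumptions made for this example, not a description of the actual MCON tool.

```python
import csv
import io
import json


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a flat {column_name: scalar} mapping."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value, start=1)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        # Assumed naming scheme: top-level keys as-is, nested ones as "pi[A]", "x[1]".
        name = f"{prefix}[{key}]" if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


def to_tsv(json_lines, meta_fields=("iter",)):
    """Convert JSON-lines samples to TSV text, with meta-data columns first."""
    rows = [flatten(json.loads(line)) for line in json_lines if line.strip()]
    if not rows:
        return ""
    columns = list(rows[0])
    # Meta-data columns go first; the rest keep their original order.
    ordered = [c for c in meta_fields if c in columns]
    ordered += [c for c in columns if c not in ordered]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=ordered, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Without something in the header naming the meta-data fields, the caller has to pass `meta_fields` by hand, which is the gap this issue is about.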

If there are multiple MCMC chains, we could combine the samples by marking the sample with which chain it comes from.

For SMC, the "generation" field might indicate which distribution the sample is from, and we'd also need a "particle" number to distinguish samples.

Q2. We could have additional meta-information that is continuous. For example, with multiple heated chains, we could annotate each chain with a temperature.

bredelings commented 11 months ago

Suppose we have single-chain MCMC:

{"iter": 10, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}

Here, we could add "time": ["iter"] or "order": ["iter"] to the header. The named field is used to order the samples, and also to indicate the degree of sub-sampling.
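To make the two uses of such a field concrete, the following Python sketch sorts samples by an ordering field and infers the sub-sampling interval from the gaps between consecutive values. The function name `order_and_stride` and the `order_key` parameter are hypothetical; a real reader would presumably take the field name from the header.

```python
import json


def order_and_stride(json_lines, order_key="iter"):
    """Sort samples by the ordering field and infer the sub-sampling interval.

    Returns (sorted_samples, stride); stride is None when the gaps between
    consecutive values are not all equal, or when there are fewer than two samples.
    """
    samples = [json.loads(line) for line in json_lines if line.strip()]
    samples.sort(key=lambda s: s[order_key])
    values = [s[order_key] for s in samples]
    gaps = {b - a for a, b in zip(values, values[1:])}
    # A single distinct gap means the samples were thinned at a regular interval.
    stride = gaps.pop() if len(gaps) == 1 else None
    return samples, stride
```

For a log recorded at iterations 10, 20, 30, this would report a stride of 10, meaning every tenth iteration was kept.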

Suppose we have multiple-chain MCMC:

{"iter": 10, "chain": 0, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}
{"iter": 10, "chain": 1, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}

Here, samples from different chains but the same iteration are sampling from the same distribution. The chain numbers don't change the distribution, and are not ordered.
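Since the chain number only labels where a sample came from, a consumer can either pool all samples or split them per chain (for example, to compare chains for convergence). A minimal Python sketch of the split follows; `split_by_chain` and its parameter names are assumptions for this example.

```python
import json
from collections import defaultdict


def split_by_chain(json_lines, chain_key="chain", order_key="iter"):
    """Group samples by chain label and sort each group by the ordering field."""
    chains = defaultdict(list)
    for line in json_lines:
        if line.strip():
            sample = json.loads(line)
            chains[sample[chain_key]].append(sample)
    for samples in chains.values():
        # Order matters within a chain, but not between chain labels.
        samples.sort(key=lambda s: s[order_key])
    return dict(chains)
```

The chain labels are used here purely as dictionary keys, which matches the point that they are unordered and do not change the target distribution.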

Suppose we have MCMCMC:

{"iter": 10, "temperature": 0, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}
{"iter": 10, "temperature": 1, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}

Suppose we have multiple chains of MCMCMC:

{"iter": 10, "temperature": 0, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}
{"iter": 10, "temperature": 1, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}

Suppose we have particle-based methods:

{"iter": 10, "particle": 0, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}
{"iter": 10, "particle": 1, "x": [1.1, 2.2, 3.3], "pi": {"A":0.3, "T":0.7}, "y": [1,2]}

For the moment, maybe just use "index": ["iter","particle"]
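One property an "index" entry could be expected to provide is that the listed fields together identify each sample uniquely. The Python sketch below checks that; the function name `duplicate_index_keys` is hypothetical, and treating "index" as a uniqueness constraint is an assumption about its intended meaning rather than something settled in this thread.

```python
import json
from collections import Counter


def duplicate_index_keys(json_lines, index=("iter", "particle")):
    """Return the index tuples that occur more than once, in sorted order."""
    counts = Counter()
    for line in json_lines:
        if line.strip():
            sample = json.loads(line)
            counts[tuple(sample[field] for field in index)] += 1
    return sorted(key for key, count in counts.items() if count > 1)
```

Applied to the two particle records above, `index=("iter", "particle")` yields no duplicates, whereas `index=("iter",)` alone reports `(10,)` as ambiguous. The same check would apply to other combinations such as `("iter", "chain")`.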