DanielTakeshi opened this issue 9 years ago. Status: Open.
Hi Daniel,

This is good, but I think you'll find that for single or few samples, multinomial is much faster than multinomial2. If multinomial is correct, we can deal with this other issue later.
-john
On 5/29/2015 9:50 AM, Daniel Seita wrote:
John,
It looks like I didn't catch a few test cases when we were checking multinomial2. I found some odd behavior this morning. Just to recap, the definition of multinomial2 is:
```
int multinomial2(int nrows, int ncols, float *A, int *B, int nvals)
```
Where
- `nrows` is the number of rows of "matrix" `A`
- `ncols` is the number of columns of "matrix" `A`
- `A` is a pointer to a data matrix where each column corresponds to an un-normalized probability distribution
- `B` is a `GIMat` of the same dimension as `A` and holds the sampling results
- `nvals` is the number of samples we want to get, where one sample should make a k-way decision.
Unfortunately, from what I can tell, the multinomial2 sampling is *ignoring* a row and putting the results in a different row. To demonstrate:
```
scala> import edu.berkeley.bid.CUMACH._
import edu.berkeley.bid.CUMACH._

scala> val test4 = grand(2,1000)
test4: BIDMat.GMat =
  0.40568  0.11102  0.97801  0.96959  0.83026  0.40141  0.41202  0.61166  0.20048  0.22633  0.12287  0.18022  0.32572  0.40397  0.86652  0.31198  0.17791  0.56108  0.59852...
  0.59785  0.32611  0.46033  0.17353  0.79053  0.91425  0.43692  0.55500  0.49750  0.96141  0.81040  0.62431  0.81548  0.96281  0.011548  0.73854  0.26540  0.56230  0.50416...

scala> val out4 = gizeros(2,1000)
out4: BIDMat.GIMat =
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0...
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0...

scala> multinomial2(2, 1000, test4.data, out4.data, 1000)
res12: Int = 0

scala> out4
res13: BIDMat.GIMat =
  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000  1000...
     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0...
```
You can see that out4 assigns all of the 1000 samples in each column to the first value (the first row). Since the weights were drawn uniformly at random, the sums across the two rows should instead be roughly equal.
As another example with a larger matrix, which shows a more interesting result, we get:
```
scala> val test6 = grand(100,1000)
test6: BIDMat.GMat =
  0.64936   0.32890  0.80571  0.060403  0.46713  0.11794  0.35396  0.51864  0.55048  0.31779  0.11267   0.84305  0.063050  0.16121  0.13459  0.15608  0.23206  0.75661   0.76700...
  0.30496   0.52751  0.28042  0.15433   0.15183  0.54877  0.97555  0.73333  0.86240  0.63230  0.41277   0.32537  0.57536   0.73076  0.44918  0.69297  0.33638  0.043051  0.22322...
  0.37311  0.080803  0.53221  0.45509   0.65039  0.92046  0.16711  0.18513  0.28504  0.79746  0.015594  0.57462  0.45046   0.76908  0.48837  0.58818  0.99301  0.95240   0.72313...
       ..       ..       ..       ..        ..       ..       ..       ..       ..       ..        ..       ..       ..        ..       ..       ..       ..       ..        ..

scala> val out6 = gizeros(100,1000)
out6: BIDMat.GIMat =
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0...
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0...
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0...
 ..  .. ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..

scala> multinomial2(100, 1000, test6.data, out6.data, 10000)
res16: Int = 0

scala> out6
res17: BIDMat.GIMat =
  173  174  177   44  143  153  264  231  258  221  101  235  129  177  108  162   98  169  222   97  238  131  323  219  305  211  250  254  241  179  276  199  262  105   34  274  248  243  133...
   87   14   87  111  128  172   31   38   38  155    4   99   96  147   89  123  194  171  133  220   47  110  116   80  120  155   85  186   80  115  163  120   64  195    8    8   59   49  157...
  135  135   49    4   28  159   14  152  190  117   20  212  134   18   15  179   96  185   71   26   19   15  175  204   95  131  129  138   10  113   72   36   28  193   31  105  162  135  154...
   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..   ..

scala> val sums = sum(IMat(out6),2)
sums: BIDMat.IMat =
  202950
  100015
   99611
      ..

scala> sums.t
res18: BIDMat.IMat = 202950,100015,99611,100062,98564,98280,101494,99784,101114,99836,100234,100544,97378,98849,99934,104187,101520,100814,100156,100427,100484,101568,99160,98448,102099,101257,100595,101254,99316,101159,101845,99685,99266,99291,100698,98455,98149,97856,103206,98644,99495,101699,102006,98426,100417,99976,99024,102430,99809,99613,100951,102319,98455,101879,103332,102684,100135,97815,101072,101160,98194,100579,100051,98683,97189,102728,100597,98409,98188,97562,98516,99912,98672,97695,100228,101165,95438,101559,98860,99721,100042,99101,100166,101560,100659,99598,97924,100269,103065,98137,99056,99702,100359,98133,98068,102242,98344,98157,100587,0
```
What happens here is that it looks like the *first* row "took" the samples from the *last* row: the first row sums to about 200k, and the other rows (except the last one) to about 100k. Also, even if you disregard the last row, some of the output doesn't make sense. The fourth column, for instance, has 111 samples in the second row and 4 samples in the third. But when we look at the test6 matrix, we see un-normalized probabilities of 0.15 and 0.45, respectively, so how come the ratio is 111-to-4 despite the probability ratio being 0.15-to-0.45?
I strongly suspect that there is an edge case problem that's causing one row to lose its samples, and maybe fixing that will resolve the ratio difference I just observed.
— Reply to this email directly or view it on GitHub https://github.com/BIDData/BIDMach/issues/51.
I did some quick tests with `multinomial`. It looks like that is working and the output makes sense to me; there is no issue with missing rows, and the samples appear proportional to their probabilities. For the `Norm` parameter, I had to do `sum(a,1).data`, where `a` is the matrix that contains our data (i.e. the `test` matrices I had in the earlier example).
I will leave this issue open, and we can get back to `multinomial2` later, given enough time (or enough complaints).
Here's a sketch of the update code we talked about yesterday:
```scala
val intrans = input.t                          // transpose
val fintrans = float(intrans)                  // convert to float (or double if CPT table size > 8 million)
val firstElems = fintrans * wmap               // wmap is a sparse matrix which maps tuples of states to a linear CPT index
val feOffset = firstElems + cptOffsets         // add the offsets (cptOffsets is a row vector) of CPT tables in the global matrix
val replOffset = feOffset * copymap            // copymap is a sparse matrix with horizontal blocks of ones that makes copies of the single offsets for each state
val stateIds = int(replOffset) + stateOffsets  // add an int row vector of strided offsets, one for each state
val rawProbs = CPT(stateIds)                   // look up all the CPT values
val logProbs = ln(rawProbs)                    // take logs
val totLogs = logProbs * ssum                  // ssum is a sparse matrix of diagonal blocks which combines CPT values for each node from its ancestors and descendants
val totProbs = exp(totLogs)                    // convert back to probabilities
for (i <- 0 until numNodesInColor) {
  val block = totProbs.colslice(startInds(i), startInds(i+1))  // get the columns for the states of one node
  val samples = multinomial(block, 1)
  val (vv, ii) = maxi2(samples)                // get the integer indices of the samples
  intrans(?, colorNode(i)) = ii.t              // copy to the corresponding input column
}
input ~ intrans.t                              // transpose back
```
Actually, you can combine cptOffsets into stateOffsets and save a step. This is also better because then you only need enough floating point bits to store the index into one CPT table. The matrices wmap, copymap, stateOffsets and ssum are all fixed ahead of time for each color group.