elxris / Turnip-Calculator

ACNH Turnip Calculator Responsive Web App
https://ac-turnip.com
MIT License

Min/max around known data points? #14

Closed · FoxFireX closed this 4 years ago

FoxFireX commented 4 years ago

Perhaps more of a question than an absolute issue. Once you've entered a few data points, the graph shows minimum and maximum values around those points you've already entered. Why is that? Shouldn't all possible models that don't actually match the values you've entered already be eliminated? I would expect to see nothing but one line (or really, four overlapping lines) in the sections where absolute truth has already been entered.

elxris commented 4 years ago

Yeah, this is right. I will make an adjustment later today!

elxris commented 4 years ago

I think it doesn't look right: [screenshot]

It doesn't give you a sense of luckiness.

Versus what we have right now: [screenshot]

mmarquez76 commented 4 years ago

I like the way it shows currently (without the proposed change).

With the proposed change, you can no longer tell whether your actual price was above or below the average predicted price. On the other hand, with the current graph, you can see exactly how lucky (or unlucky) you got by seeing how far from the average your price landed.

FoxFireX commented 4 years ago

So let me ask a question then. If I blank out all data, for Mon AM I see this: [screenshot]

I then fill in one value, Mon AM at 78, and I now see this: [screenshot]

Why did the average change from 99 to 70? Why did the maximum change from 155 to 100?

My thought was that this graph was supposed to show all possibilities based on the data that I've already entered. How is it possible that, based on a Mon AM price of 78 bells, the maximum price I could have on Monday AM is 100 and the minimum is 35? Neither of those is possible; I know what the price was. It was 78, no more, no less.

Now, I could see possibly arguing that it might be interesting to show those ranges as being what could have been, based on all data prior to that point. That's a fair argument to make, and I think is what was being suggested. But if you want to do that, then how could entering 78 possibly change the average and maximum values?

Basically, right now, I have no idea what the range around the absolute truth values is supposed to mean, and it honestly makes me question the validity of the remainder of the graph.

mmarquez76 commented 4 years ago

Now, I could see possibly arguing that it might be interesting to show those ranges as being what could have been, based on all data prior to that point. That's a fair argument to make, and I think is what was being suggested. But if you want to do that, then how could entering 78 possibly change the average and maximum values?

This part right here is a good point. It doesn't really make sense that the predictions for a day are changing depending on whether or not that day's data is known. This should be looked into a little further.

elxris commented 4 years ago

It's like with other tools.

If no data is known, the max, the min, and the average are calculated based on all the possible patterns. As more data becomes known, the patterns that are no longer possible are filtered out for the whole week.

So, for a single day, it makes sense for the max value to shrink as you enter a new known piece of data.

What wouldn't make sense would be to see the maximum increase or the minimum decrease.
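
To make that concrete, here is a minimal sketch of the filtering idea described above; the names (fitsKnownPrices, summarize) and data shapes are illustrative, not the app's actual optimizer.js code:

    // Each candidate pattern carries a {min, max} band per half-day; a known
    // price eliminates every pattern whose band cannot contain it.
    const fitsKnownPrices = (pattern, knownPrices) =>
      knownPrices.every(
        (price, i) =>
          price == null ||
          (pattern.bands[i].min <= price && price <= pattern.bands[i].max)
      );

    // Min/max per half-day are then recomputed over the surviving patterns
    // only, which is why entering a price can shrink the whole week's chart.
    const summarize = (patterns, knownPrices) => {
      const alive = patterns.filter((p) => fitsKnownPrices(p, knownPrices));
      return knownPrices.map((_, i) => ({
        min: Math.min(...alive.map((p) => p.bands[i].min)),
        max: Math.max(...alive.map((p) => p.bands[i].max)),
      }));
    };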

mmarquez76 commented 4 years ago

I see, so the maximum and average decreasing are because of the not-applicable patterns being filtered out, and the new maximum and average are then calculated using only the currently possible patterns, right?

In that case, this doesn't seem to be an issue.

elxris commented 4 years ago

Yeah, that is correct. But you're free to check the calculations. They're under https://github.com/elxris/Turnip-Calculator/blob/master/src/v2/optimizer.js

FoxFireX commented 4 years ago

I see, so the maximum and average decreasing are because of the not-applicable patterns being filtered out, and the new maximum and average are then calculated using only the currently possible patterns, right?

See, this right here is the real crux of the issue. Not applicable patterns have been filtered out. What pattern could possibly have a value of 100 or 35 for Mon AM, when I have already told it that Mon AM absolutely has a value of 78?

elxris commented 4 years ago

Patterns work this way. First, a random value is selected to determine which kind of pattern it is: small spike; up, down, up, down; high spike; or constant decrease. Then, for example, if a high spike is selected, some other random values are selected to determine when this high spike is going to happen. Finally, the game asks the random number generator for a value between two floats. So there are two types of random numbers: those that determine a branch in probabilities and those that determine a range in probabilities. Branches are calculated as separate patterns, for example, a high spike happening on Tuesday PM versus one happening on Thursday AM. But ranges don't generate separate patterns; they are responsible for the min-max values, for example, whether the spike on Tuesday PM is a 600-bell spike.
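
As a rough illustration of that split (the numbers below, such as the 2x-6x peak band and the possible spike positions, are placeholders, not the exact datamined values):

    // "Branch" randomness: each possible spike position becomes its own
    // candidate pattern. "Range" randomness: the spike height stays inside
    // each candidate as a min/max band instead of spawning new patterns.
    const intceil = (x) => Math.ceil(x); // stand-in for the game's intceil

    const highSpikeBranches = (basePrice) => {
      const branches = [];
      for (let spikeStart = 2; spikeStart <= 8; spikeStart++) {
        branches.push({
          spikeStart, // branch: one candidate pattern per position
          peak: {     // range: responsible for the min-max values
            min: intceil(2.0 * basePrice),
            max: intceil(6.0 * basePrice),
          },
        });
      }
      return branches;
    };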

FoxFireX commented 4 years ago

Okay, so I think I understand what you're saying this represents now. You're showing the ranges of all patterns which could fit the data, but then are showing the possible ranges within those patterns.

Let me ask this then, because I'm not as familiar with the logic and don't really have time to dig through it. (I'm working on about three or four other PRs for work right now, so the last thing I need is more code! :) )

For the various possible patterns, are future values independently picked from the possible values, or does the previous days' value impact it? For instance (and I'm pulling from mikebryant's tool, since it exposes the pattern numbers more easily for me), here's my current (rather disappointing) data:

M: 78 74, Tu: 70 66, We: 62

I'll pull one line from the chart at random:

decreasing, spike, decreasing | 100 | 78 | 74 | 70 | 66 | 62 | 90..140 | 90..140 | 139..199 | 139..200 | 139..199 | 40..90 | 35..87 | 35 | 200

If this were to be the specific pattern line that my island is actually running on this week, are all values in the 90..140 range possible for this afternoon? Or does the fact that I've gotten 62 today mean there's actually a lower cap on what this pattern could produce? I ask because I'm a little worried that we might end up including values in the graph that aren't actually mathematically possible given the existing data. To prove that point, when entering those same values into both tools, mikebryant's gives me a maximum possible value on the week of 552, while this tool gives a max of 553. That obviously could be a rounding issue, but it also has a chance of speaking to the inclusion of values that are no longer reachable given what we actually know.

That's why the range around known values concerned me. It makes me feel like the algorithm still believes it's possible for those values to be different than what I already said they were, which makes me doubt whether all the projected values are actually possible based on my real data, or whether they're only reachable if I happened to be at the "maximum" range listed for what has already happened. (I hope I'm making sense here.)

elxris commented 4 years ago

Thank you for taking time to understand this!

So, you're correct: rounding issues do strike. Because of that, I made a change where each pattern is a bit thicker: I'm doing min-1 and max+1 instead of trying to add precision to the floats. (JS already does 64-bit float operations, while the game's source code uses 32-bit.)

I have proof that this method works well for all the edge cases I've stepped through, including some cases where mike's tool doesn't produce any pattern at all.

I haven't found a way to reproduce the exact float operations that Ninji's code does.

I hope this is reasonable for you and resolves all your doubts.
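
For clarity, a sketch of that widening step under the same assumptions (the helper name is illustrative):

    // Widen each computed band by one bell on each side so that a real price
    // the game's 32-bit floats rounded just outside the 64-bit band still
    // matches, at the cost of slightly looser predictions.
    const widen = ({ min, max }) => ({ min: min - 1, max: max + 1 });

    widen({ min: 140, max: 200 }); // -> { min: 139, max: 201 }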

FoxFireX commented 4 years ago

So I did go ahead and look at the data mined code a bit, and actually, I'm a little afraid that what I was previously questioning may in fact be true. I'm not at all convinced that the "future" numbers are sufficiently tailored to the data that's already been entered.

From looking at the code, when you're in any sort of decreasing phase, the most that you can possibly go down in one period is 0.05 * basePrice. For sake of argument, let's assume a basePrice of 110 to give the maximum possible variation. That would come to a maximum decrease of 5.5, which again, for sake of argument, I'll round to 6 bells.

My current data, as a week one buyer, is as follows: Su ?, M 78/74, Tu 70/66, We 62/59, Th 56/?

In Mike's simulator, I believe it's working on the assumption that week one is locked into the Small Spike pattern. I don't think you have the same logic since you aren't specifically asking for it, so I believe I see the "Consistently Decreasing" pattern in the prediction.

[screenshot]

My entered data has a most recent value of 56. By the logic of the data mined code, this afternoon's value CANNOT POSSIBLY be less than 50, and that's assuming the values that give the biggest possible difference period over period. So looking at the forward looking data, the minimum prediction for Thu PM is 44. That's simply impossible, based on the current data and the data mined algorithm.
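
Working that bound through in code (a quick check of the argument above, using the 0.03 + randfloat(0, 0.02) per-step decrease from the datamined decreasing phase):

    // Largest possible one-period drop: 0.05 * basePrice, at most 6 bells.
    const maxBasePrice = 110;
    const maxRateDrop = 0.03 + 0.02;
    const maxDrop = Math.ceil(maxRateDrop * maxBasePrice); // ceil(5.5) = 6

    const thuAM = 56;
    console.log(`Thu PM floor: ${thuAM - maxDrop}`); // 50, so a predicted 44 is impossible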

If it's worth pursuing this, it probably should go in a different issue, but I wanted to raise what I found, since it was applicable to this discussion. I understand that you're showing the overall patterns, and think I'm okay with that in the historical side of things, but I'm more interested in what my possible future is, not what my possible future could have been were history different.

elxris commented 4 years ago

Thank you a lot for this. I now understand what you meant. I'm playing with the algorithm I wrote and I'm seeing good results.

elxris commented 4 years ago

@mmarquez76 @FoxFireX I think I have an experimental solution!

It's in a branch called patterns-v2. Could you try it out?

The data I've tested works great, but I'm afraid I've changed a lot of things to support this. Could you lend me a hand testing it with more datasets?

mmarquez76 commented 4 years ago

Have you pushed patterns-v2? I'm not seeing it online.

elxris commented 4 years ago

Sorry, I forgot.

FoxFireX commented 4 years ago

@elxris Is there a way to test it online, or would I have to build and deploy it somewhere? I'm honestly not familiar with accessing branches through github.io.

elxris commented 4 years ago

I've deployed it here temporarily: https://ac-turnip.com/test/

FoxFireX commented 4 years ago

My initial feeling is that it's closer to what I'd expect. I'd need to look at it a lot more closely to figure out if everything is spot on, but it's definitely closer. I notice this also ended up making the history part look much more like I had expected it to. There's still some range there, but it's much smaller.

[screenshot]

mmarquez76 commented 4 years ago

I've tested it with my data as well, and everything still looks good. I like that it still shows the range of where you landed within your current pattern, without showing irrelevant data or confusing changes like we saw before.

However, using @FoxFireX 's test data, it still shows the minimum for Thursday afternoon at 49 (see below), while it's supposedly impossible for it to be any lower than 50. It's definitely a lot closer, so the error at this point might just be attributed to rounding, but it still might need another look.

[screenshot]

elxris commented 4 years ago

Doing some math:

    // PATTERN 2: consistently decreasing
    rate = 0.9;
    rate -= randfloat(0, 0.05);
    for (work = 2; work < 14; work++)
    {
      sellPrices[work] = intceil(rate * basePrice);
      rate -= 0.03;
      rate -= randfloat(0, 0.02);
    }

Su ?, M 78/74, Tu 70/66, We 62/59, Th 56/?
minBaseRate = 0.85, stepRate = -0.05

If sellPrices[7] = 59, then intceil(rate * basePrice) = 59; at the edge case, 59 = rate * basePrice.

But what is the rate? At worst it would be rate = 0.9 - 0.05 - 5 * 0.05 = 0.6, which gives basePrice = 59 / 0.6 = 98.333.

Then sellPrices[8] = intceil(0.55 * 98.333) = intceil(54.08) = 55.

I will see what I can do.

FoxFireX commented 4 years ago

Okay, one more thing I can point out, because I'm sure you're tired of hearing from me now. :)

My pattern is definitively established as the (week one) small spike pattern. Here's how it looks right now: [screenshot]

There's the initial decreasing pattern, but then we get into the spike range. That range is set up as two values which are between 90-140% of the base price, then three higher values. The higher values are assigned by picking a rate between 140-200% of the base price, then giving the middle of the three values that price, and the day before and after anywhere between 140% and whatever that value ended up being. Here's the relevant Ninji code:

    sellPrices[work++] = intceil(randfloat(0.9, 1.4) * (float)basePrice);
    sellPrices[work++] = intceil(randfloat(0.9, 1.4) * basePrice);
    rate = randfloat(1.4, 2.0);
    sellPrices[work++] = intceil(randfloat(1.4, rate) * basePrice) - 1;
    sellPrices[work++] = intceil(rate * basePrice);
    sellPrices[work++] = intceil(randfloat(1.4, rate) * basePrice) - 1;

Knowing that, it is possible to say that once you reach price three of the spike, it is impossible for price four to have a lower value. But in the graph above, it's treating price four (Sat AM) as having the full 140-200% range. I think it would be reasonable to adjust that price to have a minimum of whatever price three of the spike was. In my case, Sat AM cannot possibly be lower than Fri PM, at 170.

Similarly, once price four of the spike (Sat AM) is known, we can set the maximum for price five of the spike (Sat PM). That price cannot possibly be higher than price four of the spike.

(If this is getting way too picky, just let me know, and I'll back off. :) )
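
For reference, a hedged sketch of the clamping being proposed here; the names and band shapes are illustrative, not the tool's code. From the Ninji snippet, spike price four is intceil(rate * basePrice), while prices three and five draw a rate no larger than rate and then subtract one bell, so price four must be at least price three, and price five at most price four minus one:

    // bands: [p3, p4, p5] as { min, max }; known: observed prices or null
    const clampSmallSpikeTail = (bands, known) => {
      const [p3, p4, p5] = bands.map((b) => ({ ...b }));
      if (known.p3 != null) p4.min = Math.max(p4.min, known.p3);     // Sat AM >= Fri PM
      if (known.p4 != null) p5.max = Math.min(p5.max, known.p4 - 1); // Sat PM < Sat AM
      return [p3, p4, p5];
    };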

Elsensee commented 4 years ago

@FoxFireX Funny, I thought about the same thing, but didn't want to be annoying, since the way it is now is totally fine! :D

However, I did notice another thing. I'm currently also in pattern 3 (decreasing, spike, decreasing), and the first value after the spike is calculated in your new code with randFloatRelative, even though it doesn't depend on the previous (spike) value, which leads to incorrect ranges.
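
A small sketch of the fix being suggested (randFloatRelative is from elxris's new code; the 0.4-0.9 fresh-rate band is my reading of the datamined pattern 3 tail, an assumption rather than a confirmed value):

    // The first post-spike value draws a brand-new rate from the base price,
    // so its band should come from the base price alone, not be derived
    // relative to the previous (spike) value.
    const firstPostSpikeBand = (basePrice) => ({
      min: Math.ceil(0.4 * basePrice),
      max: Math.ceil(0.9 * basePrice),
    });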

Vankog commented 4 years ago

Just a heads up: I tried my data on the /test deployment mentioned above, and still no pattern/graph is shown once the Friday AM value is entered: [screenshot]

What the heck is this pattern supposed to be? ^^ Maybe completely random?

Elsensee commented 4 years ago

@Vankog It could totally be pattern 0, "high, decreasing, high, decreasing, high":

Tue PM, Wed AM, Wed PM were in the first high phase.
Thu AM, Thu PM, Fri AM were the first decreasing phase.
Fri PM could have been the second high phase.
Sat AM, Sat PM were the second decreasing phase.

The real issue is your Fri AM price, because that's not really a possible value, as far as we know. If you omit it, things will work, and the tools will correctly identify that pattern as pattern 0. (Not yet in the updated and deployed algorithm, because there the drop is calculated relative to the previous high phase, not the base price, but it works on the current deployed version as well as in mikebryant's version.) The drop of 15 price points is just too high for any base price with our floating-point arithmetic.
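
A quick check of that claim (assuming the datamined pattern 0 decreasing phases lower the rate by 0.04 + randfloat(0, 0.06) per half-day):

    // Even at the largest base price, consecutive prices inside a decreasing
    // phase can fall by at most ceil(0.10 * 110) = 11 bells, so a 15-point
    // drop cannot occur there.
    const maxBase = 110;
    const maxStep = 0.04 + 0.06;
    console.log(Math.ceil(maxStep * maxBase)); // 11 < 15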

I tried to replicate this in C# with 32-bit arithmetic (because I don't have a C++ compiler on my computer and I just hoped that 32-bit arithmetic would behave similarly anyway), but at no point did I ever reach such a weird value.

Vankog commented 4 years ago

I see, using 35 as the Friday AM value, it works too. Weird ^^ In mikebryant's version I have to raise it to 38 to be in the known range.

Elsensee commented 4 years ago

While @Vankog's data now works on the current 1.8-beta (glad to see that!!), there's a bit of weird behaviour when entering the data for Fri AM. Before entering the data for that date, the minimum is displayed as 38. After entering it, the minimum is adjusted to match the 34, even though that's below the predicted minimum.

elxris commented 4 years ago

Yeah, 1.8-beta introduced a new algorithm that shows projections based on the currently known data, showing more "real" possibilities, but it can still adjust when the real data falls outside the chart by a little bit.

FoxFireX commented 4 years ago

Just wanted to say, this one's tracking a lot closer to what I was expecting/hoping for now. Really liking how it's shaping up. I'm happy to close this one when you are.

mmarquez76 commented 4 years ago

Closing for now due to request. Please feel free to drop us a line in the "chart doesn't match expectations" pinned issue if you see any other abnormalities. Thank you so much for helping us improve the calculator!