KonaeAkira / raphael-rs

Crafting rotation optimizer / macro generator for Final Fantasy XIV
https://www.raphael-xiv.com/
Apache License 2.0

Adversarial sim and Tricks of the Trade #51

Closed periodically-makes-puns closed 2 months ago

periodically-makes-puns commented 2 months ago

Attempts to resolve #27 by implementing a simulator option that assumes the worst possible condition for quality purposes.

This implementation assumes that an Excellent proc on a quality action is never worse for the crafter than no proc at all. Specifically, there must never be a sequence of two consecutive quality-increasing actions where the second is more than 6 times as potent as the first. Otherwise it would be "better" for the adversarial simulator to make the first action Excellent and the second Poor, instead of leaving both Normal.
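The factor of 6 can be derived as follows. This is a sketch that assumes the usual condition multipliers (Excellent 4x, Poor 0.5x), which are not stated in this thread; q_1 and q_2 are hypothetical names for the Normal-condition quality of the first and second action.

```latex
% Adversary prefers (Excellent, Poor) over (Normal, Normal) when it yields less quality:
4 q_1 + \tfrac{1}{2} q_2 < q_1 + q_2
\iff 3 q_1 < \tfrac{1}{2} q_2
\iff q_2 > 6 q_1
```

So the assumption holds as long as no quality action is followed by one that is more than 6 times as potent.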

With the current action set, this should never happen, as the most potent quality action is Byregot's at 300% (10 IQ) and the least potent quality action is Refined Touch at 100%. The worst case would be going from 180% (100%, 8 IQ) to 600% (300%, 10 IQ), a ratio of about 3.33x. This assumption is currently not enforced anywhere in the code.
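The arithmetic above can be checked with a small sketch. The function names are hypothetical and not part of this PR; the only assumption is that each Inner Quiet stack adds 10% to the base potency, which is what the 180% and 600% figures imply.

```rust
/// Effective potency in percent, assuming each Inner Quiet stack adds 10%.
/// (Hypothetical helper, not taken from the PR.)
pub fn effective_potency(base_percent: u32, inner_quiet: u32) -> u32 {
    base_percent * (10 + inner_quiet) / 10
}

/// True if the adversary would gain by making the first action Excellent
/// and the second Poor, i.e. the second is more than 6 times as potent.
pub fn violates_assumption(first_percent: u32, second_percent: u32) -> bool {
    second_percent > 6 * first_percent
}
```

For the worst case described here, `effective_potency(100, 8)` is 180 and `effective_potency(300, 10)` is 600, and `violates_assumption(180, 600)` is false since 600 is well below 6 x 180 = 1080.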

Under this assumption, we can define a new "pseudo-effect", call it, say, Guard. Whenever a quality-increasing action is used, it grants Guard for 1 action, ensuring that quality actions are executed with Normal condition. Quality-increasing actions used without Guard are assumed to execute with Poor condition.
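As a rough illustration of the Guard rule on its own, here is a minimal sketch. The types and the function are hypothetical and much simpler than the real simulator state; each step is reduced to a flag saying whether it is a quality-increasing action.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Condition {
    Normal,
    Poor,
}

/// For each step, `true` means a quality-increasing action was used.
/// Returns the condition this simplified Guard rule assumes for every
/// quality action, and `None` for steps that do not increase quality.
pub fn assumed_conditions(is_quality: &[bool]) -> Vec<Option<Condition>> {
    let mut guard = false; // Guard lasts for exactly one following action
    let mut out = Vec::with_capacity(is_quality.len());
    for &quality in is_quality {
        if quality {
            let condition = if guard { Condition::Normal } else { Condition::Poor };
            out.push(Some(condition));
            guard = true; // every quality action grants Guard to the next action
        } else {
            out.push(None);
            guard = false; // Guard expires unused on a non-quality action
        }
    }
    out
}
```

With the input `[true, true, false, true]`, the three quality actions are assumed to run under Poor, Normal, and Poor respectively.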

This is, however, insufficient to fully model condition. The solver may attempt to execute two un-Guarded quality actions within two steps of each other (e.g. Observe -> Advanced Touch -> Great Strides -> Byregot's). The model as described would place an Excellent proc on the Observe step and on the Great Strides step (so that Advanced Touch and Byregot's are both Poor), even though it is impossible to go from Poor to Excellent. So in general, we may have a long sequence of un-Guarded quality actions of which we can only "take" (force to Poor condition) some. Specifically, any two un-Guarded quality actions we "take" must be at least 3 steps apart.

Because of this condition, we may in effect treat any un-Guarded quality action that is at least 3 steps away from the nearest other un-Guarded quality action as entirely independent. The issue therefore only arises for a series of un-Guarded quality actions that are each two steps apart (e.g. spamming the Observe -> Advanced Touch combo).
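To make the grouping concrete, the following sketch splits the step indices of un-Guarded quality actions into chains, where neighbours inside a chain are less than 3 steps apart. The function name and the index-based representation are assumptions made for illustration; the PR does not work on explicit index lists.

```rust
/// Groups step indices of un-Guarded quality actions into chains.
/// Two neighbouring indices belong to the same chain when they are fewer
/// than 3 steps apart; a gap of 3 or more starts a new, independent chain.
/// Precondition: `steps` is strictly increasing.
pub fn split_into_chains(steps: &[usize]) -> Vec<Vec<usize>> {
    let mut chains: Vec<Vec<usize>> = Vec::new();
    for &step in steps {
        // Does this step conflict with the last step of the current chain?
        let extends = chains
            .last()
            .and_then(|chain| chain.last())
            .map_or(false, |&prev| step - prev < 3);
        if extends {
            chains.last_mut().unwrap().push(step);
        } else {
            chains.push(vec![step]);
        }
    }
    chains
}
```

For example, steps `[1, 3, 6, 10, 12]` split into `[[1, 3], [6], [10, 12]]`: only the actions inside the first and last chain constrain each other.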

This can be solved with DP: given a streaming array of nonnegative integers (each representing the difference between the Normal and Poor outcome of an action), find the maximum sum obtainable when no two consecutive numbers may be taken. This is implemented through the new unreliable_quality: [u16; 2] field on SimulationState. With this addition, the model is sufficient.
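A minimal sketch of that streaming DP, using a two-element state of the same shape as the field. The wrapper type, the method names, and the meaning assigned to each slot are assumptions for illustration; the actual layout of unreliable_quality in the PR may differ.

```rust
/// Streaming state for "maximum sum of non-consecutive elements".
/// Assumed layout (not confirmed by the PR):
///   slot 0: best sum so far if the latest element was NOT taken
///   slot 1: best sum so far if the latest element WAS taken
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct UnreliableQuality(pub [u16; 2]);

impl UnreliableQuality {
    /// Feed the next Normal-minus-Poor difference of a conflicting action.
    pub fn push(self, diff: u16) -> Self {
        let [skipped, taken] = self.0;
        // Skipping keeps the better of the two old sums; taking is only
        // allowed on top of a state where the previous element was skipped.
        Self([skipped.max(taken), skipped.saturating_add(diff)])
    }

    /// Call when the next action is far enough away to be unconstrained.
    pub fn break_chain(self) -> Self {
        let best = self.worst_case_loss();
        Self([best, best])
    }

    /// Largest total quality the adversary can remove so far.
    pub fn worst_case_loss(self) -> u16 {
        self.0[0].max(self.0[1])
    }
}

/// Convenience wrapper over a whole slice of differences.
pub fn max_non_adjacent_sum(diffs: &[u16]) -> u16 {
    diffs
        .iter()
        .fold(UnreliableQuality::default(), |state, &diff| state.push(diff))
        .worst_case_loss()
}
```

Because only two running sums are carried forward, the state stays constant in size no matter how long the chain gets, which matters when it is part of a hashed solver state.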

Now for Tricks of the Trade. I wanted to model its use as pointed out in #27. I opted to model Tricks as a worse Observe: it ticks all statuses, but does not grant durability from Manipulation, since it is possible that Tricks did not proc. However, it does ensure that the action following it grants Guard. In order to prune the search tree more aggressively, I defined a combo that forces Tricks to be followed by either Innovation, Great Strides, or Observe, since it is only effectively used as a quality action.
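The combo restriction could look roughly like the sketch below. The enum and function are hypothetical and cover only a handful of actions; they do not mirror the solver's actual action or combo types.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Action {
    TricksOfTheTrade,
    Innovation,
    GreatStrides,
    Observe,
    BasicTouch,
    ByregotsBlessing,
}

/// Returns whether `next` may follow `previous` during the search.
/// After Tricks of the Trade only Innovation, Great Strides, or Observe
/// are allowed; every other transition is left unrestricted here.
pub fn allowed_after(previous: Option<Action>, next: Action) -> bool {
    match previous {
        Some(Action::TricksOfTheTrade) => matches!(
            next,
            Action::Innovation | Action::GreatStrides | Action::Observe
        ),
        _ => true,
    }
}
```

Restricting the successors of Tricks this way trades a small amount of generality for a much smaller branching factor at every Tricks node.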

Current issues:

periodically-makes-puns commented 2 months ago

Everything except additional solver tests and the condition bruteforcer should be implemented in that latest commit.

periodically-makes-puns commented 2 months ago

Also, I suspect the second issue I listed (unreachable being called) may be due to the solver running out of memory, which is probably tied to the third issue (the upper bound solver visits significantly more states during adversarial calculation, maybe?).

NotRanged commented 2 months ago

If it is impossible to reach the quality target under the adversarial sim, the program will (eventually) run into a RuntimeError (unreachable called).

In this case, some sort of progress or status bar may be nice to have? The current rotating half-circle showing the user that something is happening is nice, but visual feedback indicating an upper bound on how long they may have to wait would go over well.

In general, this may be nice for this tool; the older crafting solvers (like https://dazemc.github.io/ffxiv-craft/#/solver) had great visual feedback where you could see it at work 'improving' your craft and making progress. While this works nicely for their genetic algorithm approaches, this may not be as viable to implement in this tool. But to me (and many others) there was a big charm to watching the tool work its way to a solution.