jeancroy / RP-fit

MIT License
11 stars 1 forks source link

API to use DFO #4

Open RaenonX opened 1 year ago

RaenonX commented 1 year ago

API options TBD. Maybe start from listing out what's available to toggle first? In general, a class that contains fields as an option object could work and it's easy to maintain.

jeancroy commented 1 year ago

I was mostly referring to this

FitOptions.rounding.final_rp = RoundApprox.Exact
FitOptions.rounding.bonus = RoundApprox.Exact
FitOptions.rounding.period = RoundApprox.Exact
FitOptions.rounding.components = RoundApprox.Pass

rounding.bonus there's some strange interaction on pokemon with both energy recovery nature and gold subskill. This step multiply both and round the result of that multiplication. It is super weird but handle the observation correctly. It's turned on currently for both normal and dfo fit.

rounding.period. So we use flooring of the help per hours. 5 * floor( 3600 / final_period , 0.01) where final_period is the help time in seconds. There might be double flooring on that term 5 * floor( 3600 / floor(final_period,0.01) , 0.01) and enabling the toggle put double flooring. This toggle is currently disabled on the main fit.

Right now I think the double flooring result is strictly better. It does strange stuff to like pikachu level 7 with speed of help, but without the double flooring there's strange stuff with some of our worst data points. I can enable it on the main fit because there's no unknown value going into the computation of final_period. It should probably be turned on with gsheet and your formula.

rounding.components this was to explore the fact each of ingredient/berry/skill may have their own rounding, most likely down. What I like about that idea is that the period rounding overall push RP values up. And this push RP value down (if we use floor) so maybe they balance each other. So far, I don't like the result of that. Maybe there's still value in the idea, but not as I did it.

final_rp this model the fact that the end result RP is rounded to an integer value. It requires the least amount of change on the consumer side, you probably already do that rounding. But it completely destroy the derivative information. And that's why I tried to use derivative free method. The end result is also strictly better, but not by much and at a relatively large computation cost.

RaenonX commented 1 year ago

About double flooring, if that is actually strictly better, could you elaborate where exactly it floors? Can't tell from simply looking at 5 * floor(3600 / floor(final_period,0.01), 0.01) tbh. I don't have data point analysis come in handy, but I trust you lol

Implementation-wise regarding passing options, I think maybe can make a Python dataclass that contains the default values, then we can instantiate this dataclass to decide what options to turn on/off with the default?

Example implementation might be

@dataclass
class FitOptions:
  double_rounding: bool = True

And for instantiate

FitOptions()  # Use all default values
FitOptions(double_rounding=False)  # Override double rounding to false
jeancroy commented 1 year ago

Sorry I was using spreadsheet nomenclature because most of the research is done on spreadsheet, but 0.01 is two decimal places.

def floor(value, resolution=1.0):
    return np.floor(value / resolution) * resolution
jeancroy commented 1 year ago

I think maybe can make a Python dataclass that contains the default values, then we can instantiate this dataclass to decide what options to turn on/off with the default?

Honestly I'm still figuring out how to do shared settings in python. Requirements are that it can be shared and amount of boilerplate is relatively minimal. I'm not against type annotation.

I started with SimpleNamespace. Then I went singleton and implicit class property. Then you made explicit that FitOption is a class and everything is a class property. We can try data class.

FitOptions(double_rounding=False) 

I believe this create a new instance of FitOptions and overwrite default values on that instance. But it's not a sharing mechanism the way class properties are.

Timing is a major issue. Same as with setting the file path. The instance is already decided by the time we do import rp_model.calc.xyz.

That was the major benefit of the __dict__ overwrite approach I first proposed. It can be done as late as the last second.

jeancroy commented 1 year ago

Yeah dataclass does this

@dataclass
class C:
    a: int       # 'a' has no default value
    b: int = 0   # assign a default value for 'b'
def __init__(self, a: int, b: int = 0):

Those are just shorthand for code generation of an auto init function.

https://docs.python.org/3/library/dataclasses.html

RaenonX commented 1 year ago

Sorry I was using spreadsheet nomenclature because most of the research is done on spreadsheet, but 0.01 is two decimal places.

def floor(value, resolution=1.0):
    return np.floor(value / resolution) * resolution

I see. Where is this in the calculations btw? Not sure if that's something to put in the website calculations.

RaenonX commented 1 year ago

Yeah dataclass does this

@dataclass
class C:
    a: int       # 'a' has no default value
    b: int = 0   # assign a default value for 'b'
def __init__(self, a: int, b: int = 0):

Those are just shorthand for code generation of an auto init function.

https://docs.python.org/3/library/dataclasses.html

Yeah dataclass does exactly that, I am just not sure if it fits our need as we're actually using singleton in lots of places - I am fine with any pattern as long as the API call can be something like this:

from calc.rp_model import get_rp_model_result

def get_rp_model_df():
    # Configure settings

    return get_rp_model_result("results/{hash_id}.pickle")
jeancroy commented 1 year ago

I don't quite understand that requirement. What happens when you want to change more than one setting ?

I understand singleton is basically a global variable with a fancier name, but it was the only way I figured out to communicate from the notebook to the modules, and also share setting between notebooks.

RaenonX commented 1 year ago

I don't quite understand that requirement. What happens when you want to change more than one setting ?

I understand singleton is basically a global variable with a fancier name, but it was the only way I figured out to communicate from the notebook to the modules, and also share setting between notebooks.

Not sure if we're on the same page, but basically what is desired is to allow the following calls reflect the settings change:

def get_rp_model_df():
    FitOptions.something = "other stuff"

    return get_rp_model_result("results/{hash_id}.pickle")

To keep it singleton, I think something like this should work:

def get_rp_model_result(whatever_stuff):
    refresh_settings()
    ...  # Proceed calculation
jeancroy commented 1 year ago

Anyways I overstated the need to change settings.

Once I confirm the double flooring is better than non double flooring, and I enable it everywhere, there's not much to tweak. Or it'll become only R&D stuff that the api will not use... until it become mainline and everything use it.

I'll let this cool down... and maybe the boostrap stuff is more important. Or trying to improve the skill model.

RaenonX commented 1 year ago

Anyways I overstated the need to change settings.

Once I confirm the double flooring is better than non double flooring, and I enable it everywhere, there's not much to tweak. Or it'll become only R&D stuff that the api will not use... until it become mainline and everything use it.

I'll let this cool down... and maybe the boostrap stuff is more important. Or trying to improve the skill model.

Makes sense, I'd agree that bootstrap is more important than allowing the options since I am unlikely to tweak that options anyway