HcParams leads to a complicated interface

jatkinson1000 / archeryutils

A collection of archery code and utilities in python

https://archeryutils.readthedocs.io

MIT License

13 stars 3 forks source link

HcParams leads to a complicated interface #2

Closed LiamPattinson closed 8 months ago

LiamPattinson commented 2 years ago

It'd be good to hide a lot of the HCParams stuff, maybe by making it an optional argument to the functions that use it, and falling back to default values if it isn't provided. I think the ideal system would be something like:

from archeryutils import score_from_handicap, handicap_from_score, rounds

# Default to current AGB methods
myscore = score_from_handicap(rounds.york, 38)
myhc = handicap_from_score(rounds.portsmouth, 575)

Alternative handicap systems could be used by setting a context using fancy with methods, or as optional args to the standard functions. Or we could just have a global context setter:

archeryutils.set_handicap_system("some_other_method")

jatkinson1000 commented 2 years ago

Agreed. At present it just gets defined from a default file at the start and then passed all the way down. Therefore makes sense to turn it into a default argument such that it becomes optional for the user to define their own and pass all the way down only if desired.

The magic numbers shouldn't need changing, but I'd perhaps leave the option to set the handicap system, and the indoor and outdoor arrow diameter, as app developers/users are more likely to fiddle with these.

jatkinson1000 commented 9 months ago

@LiamPattinson after some thought on this I feel the easiest thing might be to make HcParams an optional argument, and any function using it spins up a new instance from the default values unless it is provided.

The idea of the context setter is programmatically nice, but I'm not sure how "pythonic" it is, and feel it would be harder for users to get their heads around.

I think I'd still leave the handicap system argument (but maybe make optional defaulting to "AGB").

The arrow diameter I am undecided on at the moment. It is currently included in HcParams data, but is the one thing a user is most likely to change. However, having it as an optional arg that overrides the HcParams value feels a little clunky, and I'm not wild about the ambiguity of being able to set it in two places.

How does that sound/any thoughts/suggestions?

LiamPattinson commented 9 months ago

I'd argue that context setters are definitely 'Pythonic', but like many concepts falling under that label that doesn't necessarily mean they're a good idea! Having looked a bit closer, I don't think they are the best solution in this case, and I agree that it might scare away users who aren't so well-versed in Python.

The main issue I see with it is that the user needs to set up HcParams -- a fairly complicated class containing a lot of mysterious data, and then pass this to whatever functions need to use it along with the string hc_sys. The value of hc_sys determines which data are used, and most of it isn't needed for a given set of calculations.

As much as it pains me to say it, I think I'd recommend just going for the classic inheritance tree:

from abc import abstractmethod, ABC

class Handicap(ABC):

    @abstractmethod
    def sigma_t(self, handicap, dist):
        pass

    def sigma_r(self, handicap, dist):
        sig_t = self.sigma_t(handicap, dist)
        sig_r = dist * sig_t
        return sig_r

    def arrow_score(self, target, handicap, arw_d):
        ... # Can define this in base class

    def score_for_passes(self, rnd, handicap, arw_d = None):
            pass_scores = np.array(
                [
                    pass_i.n_arrows
                    * self.arrow_score(pass_i.target, handicap, arw_d=arw_d)
                    for pass_i in rnd.passes
                ]
            )
            return pass_scores

    def score_for_round(self, rnd, handicap, arw_d = None, round_score_up = True):
        round_score = np.sum(self.score_for_passes(rnd, handicap, arw_d), axis=0)
        return self._round_score_up(round_score) if round_score_up else round_score

    @staticmethod
    def _round_score_up(score):
        """Most schemes use plain rounding, new AGB doesn't. Override this in that class"""
        return np.around(score)

    # Also define functions from handicap_functions.py

# Define subclasses:

class HandicapAGB(Handicap):

    def __init__(self, datum=6.0, step=3.5, ang_0=5.0e-4, kd=0.00365):
        self.datum = datum
        self.step = step
        self.ang_0 = ang_0
        self.kd = kd

    def sigma_t(self, handicap, dist):
        return self.ang_0 * ((1.0 + self.step / 100.0) ** (handicap + self.datum)) * np.exp(self.kd * dist)

    @staticmethod
    def _round_score_up(score):
        return np.ceil(score)

To simply the user's interface, I like to define 'factory' methods, usually in the __init__.py file:

# handicaps/__init__.py

from .handicap import Handicap
from .agb import HandicapAGB, HandicapAGBOld

_CLASSES = {
    "agb": HandicapAGB,
    "agbold": HandicapAGBOld,
}

def handicap(hc_sys: str, **kwargs):
    try:
        return _CLASSES[hc_sys](**kwargs)
    except KeyError:
        raise ValueError(f"{hc_sys} is not a recognised handicap system")

# User code:

>>> from archeryutils.handicaps import handicap
>>> hc = handicap("agb")
>>> score = hc.score_from_handicap(round, 34)

This means the user only has to import one function instead of importing multiple classes individually.

There are probably better names for everything here -- the word 'handicap' in particular seems a bit overloaded. I'd recommend defining as much as you can in the base class, and where the schemes differ you can defer to small overrideable helper functions. You might also want to include an intermediate layer of abstract classes to group together the shared functions and data in AGB/AGBold and AA/AA2, but shallower inheritance trees tend to be more maintainable in my experience.

Regarding the arrow diameter, I think it would be best for users to pass that into functions themselves, since that feels like a feature of the archer rather than a feature of the handicap system. If the handicap schemes have a default arrow diameter built into them, maybe they could have that defined as a function on each class.

Hope this helps! No worries if you'd rather go with an alternative approach. Since it looks like you're already pretty deep into solving this issue, it might be better to hold off on my suggestions until after the initial release.

jatkinson1000 commented 9 months ago

:sob: I'll digest this a bit more, but something along these lines probably makes sense I think... I guess I know what I'm doing at the weekend. Thanks for taking a look and making suggestions - I assume you would prefer me to refactor and PR due to being busy, but if you want ownership let me know.

Also going to tag @TomHall2020 to see if he has any thoughts.

LiamPattinson commented 9 months ago

I'm a little too busy to work on it right now, but I'm expecting to have a lot of free time in a couple of weeks, so I'd be happy to work on it then.

TomHall2020 commented 9 months ago

Thanks @jatkinson1000

Agree arrow diameter should be easily adjustable for end use cases. eg a scoring app which does plots with accurate arrow diameters.
But also that it should still easily fall back to the defined defaults for a given handicap system. People also want to know what their "official" handicap is for a round, and all the systems make assumptions on arrow diamter for those official handicaps and classificaiton thresholds.
The other parameters I think will rarely if ever be touched by end users.

The HcParams class is pretty odd to me. The data it contains certainly needs to be provided as default to most of the functions as that is almost invariably how its called. But even then, trying to override one single parameter requires either reassigning after init, or recreating the whole sub-dictionary

hc1 = HcParams(agb_hc_data= {
    "AGB_datum": 6.0,
    "AGB_step": 3.5,
    "AGB_ang_0": 5.0e-4,
    "AGB_kd": 10.00365, # changed value in init
})

hc2 = HcParams()
hc2.agb_hc_data["AGB_kd"] = 10.00365 # changed value after init

Either way feels clunky to me, and the definition of the class with all those default parameters being assigned in lambdas is just kind of offputting as well as a potential user, especially as dataclass features arent really being used elsewhere in the code.

@LiamPattinson's idea of factoring out that data into the owner of the handicap calculation functions does help with that. If keeping the current structure it would be good to at least have the sub dictionaries be typed so that their keys can be autocompleted, or even make them their own minature dataclasses.

I think the inheritance route works and would also have the advantages of reducing branch points in arrow_score (choosing default arrow diameter), score_for_round (choice of rounding up or down), and especially sigma_t (where most of the hcparams data is actually used). I also agree that bringing the handicap_functions public methods into those specialised classes would be helpful as well, it becomes clearer what you want to import from where without understanding the internals of how handicap_equations and handicap_functions work.

Having the convience function to get the right class from a string input is essential, otherwise user code has to be all over the place with importing classes from different places and that is a right pain.

So with that in place, the workflow for a user calling an existing handicap system and overriding parameters might be:

#simple implementation as before with a dict lookup
from archeryutils.handicaps import get_handicap_system
hc = get_handicap_system("AGB")
hc.kd = 10.00365 #adjusted
hc.score_from_handicap("round", 34, arw_d=0.075)

#fancy version with kwargs being passed through??
hc = get_handicap_system("AGB", kd=10.00365)
hc.score_from_handicap(...)

I think version 2 here is kind of neat but also really unecessary, again I very much doubt the majority of people will mess around with the existing parameters for existing systems.

jatkinson1000 commented 8 months ago

I think we all agree to keep arrow diameter as an optional argument.

@LiamPattinson - I have been thinking about the inheritance approach and it has a lot of merits. Particularly the fact it would compartmentalise and tidy some of the branching logic.

However, the need to set hc = handicap("agb") and then use functions contained in hc still feels a bit clunky. I think it would be nice if if the function calls were explicit as to the scheme being used, i.e.

import archeryutils.handicaps as hc

hc.score_for_round(round, handicap, "agb")
# or
hc.score_for_round_agb(round, handicap)

There is also the issue that the functions in handicap_functions.py make use of a similar API and can't easily be brought under the handicap class/subclasses (more on this below).

Stepping back and reflecting on the above I think a sensible workflow would be:

Call all handicap functions with minimum args, a clear identification of 'scheme', and use default values.
- This 'scheme' could be set once as a variable for users in their code and used where needed, or be part of the function call.
- setting as an argument I think makes more sense when considering this in conjunction with handicap_functions.
Update the key parameters for a handicap scheme once, and once only, before calling functions.
- This would somehow need to modify the behaviour for the handicap_functions as well (which is why HcParams was always a top-down argument for everything)

To this end I think it makes sense if functions are overloaded and can be passed an argument that can be a Union[str, Handicap]. If a Handicap then use this directly, if str then initialise a Handicap of this type with default values.

Something like:

import archeryutils as au
from archeryutils import handicap_equations as hc_eq
from archeryutils import handicap_functions as hc_func

# Uses default values for the "AGB" system
score_from_hc = hc_eq.score_for_round(round, handicap, "AGB")
hc_from_score = hc_func.handicap_from_score(score, round, "AGB")

# Uses custom values for the "AGB" system
custom_sys = hc_eq.handicap("agb", kd=3.14159)
score_from_hc = hc_eq.score_for_round(round, handicap, custom_sys)
hc_from_score = hc_func.handicap_from_score(score, round, custom_sys)

I feel this keeps things simple for novice users, whilst also being fairly clean for technical users, of which there will be relatively few (see above).

Some notes:

hc_eq.handicap() would be the guard point against setting inappropriate schemes. This would then knock on as a guard when called as a part of any other function.
I would use the class inheritance approach as set out above as this should give tidy manageable code and clean up the branching

I don't think there is a need to override for the classification schemes since the other numbers involved don't make sense when not tied to the default scheme.

Further, it strikes me as odd that score_for_round and handicap_from_score come from two places (hc_eq and hc_func respectively). This made sense at the time of writing (and still does in terms of the internals), but to a naïve user I can see that this would be confusing. I think this is an opportunity to:

bring all code into one file, or
take the interface to score_for_round into handicap_func, or
expose things at the subpackage level and just require users to import archeryutils.handicaps as hc

of which I prefer the latter.

TomHall2020 commented 8 months ago

Agree that calling methods on a system feels less intuitive than a pair of simple functions exposed at the top level and passing the system by string reference.

Maybe then take arrow_diameter as an optional parameter on those functions, and if it is None allow the handicap system to set its own default value with whatever logic it wants (this already mostly happens in the arrow_score function.

def score_from_handicap(
    round: Round, 
    handicap: float # for a high level api I think this probably should be a single value and not a numpy array
    system: Literal["AGB", "AA"...]
    arrow_diameter: Optional[float] = None
) -> int:
    ... # get class, dispatch to its methods.

Would recomend calling the class System, HandicapScheme or something similar rather than just Handicap, as that sounds more like its a class to model a single handicap value in itself.

jatkinson1000 commented 8 months ago

Writing to note that this is now WIP by @jatkinson1000 as of today.

Small bump in the road from being quite as 'easy' as I thought above, but getting there.

Base class is indeed called HandicapScheme 😉

I will ask you both to take a look and review once I open the PR.

LiamPattinson commented 8 months ago

I can see advantages and disadvantages to the plain function interface and the class interface, and in some cases I'd prefer to use the latter. The issue I see with passing the handicap system as a string is that it can result in a lot of repetitive extra function arguments. For example, say you're trying to compare award schemes:

star_purple = hc.handicap_from_score(1400, wa1440, "AGB")
star_gold = hc.handicap_from_score(1350, wa1440, "AGB")
star_red = hc.handicap_from_score(1300, wa1440, "AGB")
star_blue = hc.handicap_from_score(1200, wa1440, "AGB")
star_black = hc.handicap_from_score(1100, wa1440, "AGB")
star_white = hc.handicap_from_score(1000, wa1440, "AGB")

rose_purple = hc.handicap_from_score(1250, york, "AGB")
rose_gold = hc.handicap_from_score(1200, york, "AGB")
rose_red = hc.handicap_from_score(1100, york, "AGB")
rose_blue = hc.handicap_from_score(1000, york, "AGB")
rose_black = hc.handicap_from_score(900, york, "AGB")
rose_white = hc.handicap_from_score(800, york, "AGB")

The extra "AGB" at every call doesn't really add any new information*, and it would be cleaner in this case to initialise a HandicapAGB class and call handicap_from_score from that instance. On the other hand, I agree that the plain function approach is cleaner if you only want to make one or two function calls.

A compromise would be to expose both interfaces to the user, and implement the plain functions as a thin layer on top of the class-based system:

def score_from_handicap(handicap, round, system, arrow_diameter = None):
    return handicap_system(system).score_from_handicap(handicap, round, arrow_diameter)

This is a fairly pedantic level of library design though, so no worries if you'd prefer to only expose the plain functions and keep the class interface hidden away as an implementation detail.

* I'm aware that there's already a 'cleaner' way to write this and also avoid repeating the function call and the round name:

star_purple, star_gold, star_red, star_blue, star_black, star_white = [
    hc.handicap_from_score(x, wa1440, "AGB") 
    for x in (1400, 1350, 1300, 1200, 1100, 1000)
]

I imagine most people would opt for writing it out in full though, as that's a pretty ugly 'one liner'.

TomHall2020 commented 8 months ago

The other option for users that want to keep it clean is just doing a Partial application.

# I quite like the comprehension approach here
agb_hc_from_score = functools.Partial(hc.handicap_from_score, system="AGB")
star_scores = {'purple': 1400, 'gold': 1350, ...}
star_handicaps = {star: agb_hc_from_score(score, wa1440) for star, score in star_scores.items()}

In general though I support keeping things open, so I don't see anything wrong with making the classes accessible for people to work with directly as well. The function is more a convience for interactive use or where users want to wrap it and dispatch on different handicap systems.

jatkinson1000 commented 8 months ago

Yeah, there's no perfect solution.

I think what you describe here is actually what I am doing - I'll get you to review my PR once I have things in place and push up.

jatkinson1000 commented 8 months ago

Quick play over lunch.\ Current interface looks something like:

>>> import archeryutils as au
>>> import archeryutils.handicaps as hc
>>> my1440 = au.load_rounds.WA_outdoor.wa1440_90
>>> hc.score_for_round(my1440, 37, "agb")
1110.0
>>> hc.score_for_round(my1440, 37, "aa")
572.0
>>> hc.score_for_passes(my1440, 37, "agb")
array([237., 276., 276., 322.])
>>>
>>> # Or, using the class approach for advanced users:
>>> agb_hc = hc.handicap_scheme("agb")
>>> agb_hc.score_for_round(my1440, 37)
1110.0

However, restructuring the tests is going to be a nightmare 😭 Packaging __init__ interface is also proving a little tricksy.

jatkinson1000 commented 8 months ago

Notes to self until PR is opened:

[ ] Standardise round_score_up, rounded_score, int_prec to a single argument, probably rounded.
[ ] Move rootfinding etc. to be protected static methods of HandicapScheme Class.
[ ] Decide if full docstrings should be supplied in Parent or child classes or both.
[ ] Decide what is documented where - perhaps need API Docs and Developer Docs?
[ ] Standardise argument order: (handicap, round) or (round, handicap)? Same for (score, round).