jimmyday12 / fitzRoy

A set of functions to easily access AFL data
https://jimmyday12.github.io/fitzRoy
Other
129 stars 27 forks source link

Refactor package function structure #131

Closed jimmyday12 closed 3 years ago

jimmyday12 commented 4 years ago

As the package has grown organically and we've added more functions on an as-required, the way in which various functions work is becoming inconsistent. It also started with me being fairly inexperienced in creating a package!

This has resulted in a fairly convoluted and confusing experience for users.

For example, to get player stats from Footywire, we use update_footywire_stats. A similarly named, but different behaving function also gets player stats get_footywire_stats. This get function is quite different in behaviour to the similarly named get_afltables_stats.

On top of being confusing as to which function to use, all functions have slightly different arguments, providing a confusing API for our users.

I'm thinking we could restructure the code to make a minimal set of 'key' functions exposed to users with consistent arguments that act as a wrappers for internal functions. We could then clean up a lot of the existing functions to be internal.

As a bit of brain dump, we might have something like

- get_fixture(source = “x”, comp = AFLM/AFLW, season = 2020, ...)
- get_results(source = “x, comp = AFLM/AFLW, season = 2020, ...)
- get_ladder(source = “x”, comp = AFLM/AFLW, season = 2020, ...)
- get_team_stats(source = “x”, comp = AFLM/AFLW, season = 2020, ...)
- get_player_stats(source = “x”, comp = AFLM/AFLW, season = 2020, ...)

Each of those high level functions could call the existing functions (which would become internal) but for a user, you'd just have to remember the high level function and the source you prefer. For example -

- get_fixture()
    - get_afltables_fixture()
    - get_footywire_fixture()
    - get_afl_fixture()

I'm sure we could look to other packages for good examples of this. I'll leave this ticket as a place for feedback and suggestions, as well as any dumping any research findings.

cfranklin11 commented 4 years ago

This is a good idea, and the example functions are a good start. A couple of questions:

Also, a note on naming: this might not matter in a functional language like R, but in object-oriented languages, the use of get as a function/method prefix is discouraged, because it's often used in the context of attribute getters/setters. So, I tend to see fetch used to indicate a function/method that gets data from an external source.

jimmyday12 commented 4 years ago

What sort of data do you have in mind for get_team_stats? Is it for in-game stats (e.g. kicks, marks) aggregated by team?

Yeah good question. This was my initial thought (basically take the player stats and aggregate them).

How important is the source of data? I'd be interested in hearing from others, because for how I use the data, the source doesn't really matter, so I'm thinking it's one of those implementation details you could hide from the end user

Yeah - I'm not sure it's super important from a perspective of "I like one source more than the other". I think where it becomes important though is that the different sources have different spelling for teams/players and sometimes different data update periods (e.g. footywire results update after each game whereas afltables update after each round).

So I think as a minimum, people need to be able to ensure they are pulling data from a consistent source (or if they aren't, it's a deliberate choice they make). But I take your point - perhaps a consistent default is ideal, with end user given the option to change if they like.

Also, a note on naming: this might not matter in a functional language like R, but in object-oriented languages, the use of get as a function/method prefix is discouraged, because it's often used in the context of attribute getters/setters. So, I tend to see fetch used to indicate a function/method that gets data from an external source.

Interesting - I must say without a programming background these kind of things are a bit beyond my knowledge but I'm happy to take advice here. Thinking about it from a pure functionality perspective, I actually think fetch makes more sense. It also allows a smoother transition from existing functions as we can just add these new high level fetch_ set of functions and slowly deprecate the get_ functions.

Thanks for the input!

jimmyday12 commented 3 years ago

This is now largely done and will be part of the upcoming 1.0.0 release. Closing for now.