IDEMSInternational / cdms.products

An R package for manipulating and analysing historical climatic data.

Parameter names - consistency #34

Open lilyclements opened 2 years ago

lilyclements commented 2 years ago

I've seen a few inconsistent parameter names across functions.

E.g.

  1. date_time vs date
  2. element vs elements
  3. station vs stations

I suggest we use: date, element, station.

@dannyparsons what do you think?

lilyclements commented 2 years ago

We use station_id in geoclim functions. Is this different to station?

isedwards commented 2 years ago

In case it's useful, each of these examples means something very specific in Python, especially when working with databases and object-relational mapping. station would be an object (instance) representing a row in the database, with each database column as a property, e.g. station.name, station.latitude, ... , station.id.

It's very common to have a variable called station_id, of type Integer or String, that contains a value taken from a station.id.

I would expect elements and stations to be collections containing multiple objects of type element and station respectively.

date_time would be an instance of DateTime (a date with a time) whilst date would be a Date object without a time.
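To make these distinctions concrete, here is a minimal Python sketch. The Station class, its fields, and all values are hypothetical stand-ins for an ORM-mapped row, not anything taken from cdms.products.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Station:
    """Hypothetical stand-in for an ORM-mapped database row."""
    id: int
    name: str
    latitude: float


# station: one object representing a single row.
station = Station(id=101, name="Station A", latitude=-1.29)

# station_id: a plain value taken from station.id.
station_id = station.id

# stations: a collection of Station objects.
stations = [station, Station(id=102, name="Station B", latitude=0.52)]

# date_time carries a time component; the date derived from it does not.
# (Named obs_date here only to avoid shadowing the imported date class.)
date_time = datetime(2020, 3, 14, 9, 30)
obs_date = date_time.date()
```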

dannyparsons commented 2 years ago

Thanks @isedwards that's useful context.

In these functions, the first argument is a data frame and these extra arguments specify the column names in the data.

So I think keeping them short and simple would be good and in line with R best practice. Although they are strings specifying a column name, I don't think we should add a suffix like _name or _col to all the arguments, even though that would distinguish them more clearly from "objects". It should be clear from the function signature and documentation that these are names of columns in data.

On those specific names:

For functions where station is used for facets, I prefer using station because it could be either station_id or station_name. For geoclim the station_id is required specifically. So I think that's the distinction.

Thanks for helping to get the consistency here @lilyclements

lilyclements commented 2 years ago

I think I prefer date_time to date because it will be a date with a time in some cases.

@dannyparsons should we also use date_time as the parameter name in functions that accept a date-time object but where a time component doesn't necessarily make sense, e.g. dekad, pentad?

dannyparsons commented 2 years ago

Yes, that sounds good. Pretty much everywhere that we use a Date, a date-time would work as well, because we're generally using it to extract components or as a time series sequence, which could include a time component too.
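As a rough illustration of why a function like dekad can accept either type, here is a small Python sketch, not the package's R implementation. The name dekad_of_year is hypothetical, and it assumes the common convention of three dekads per month (days 1-10, 11-20, and 21 to month end).

```python
from datetime import date, datetime


def dekad_of_year(value: date) -> int:
    """Return the dekad of the year (1 to 36) for a date or datetime.

    Assumed convention: three dekads per month, covering days 1-10,
    11-20, and 21 to the end of the month.
    """
    dekad_in_month = min((value.day - 1) // 10, 2)  # 0, 1 or 2
    return (value.month - 1) * 3 + dekad_in_month + 1


# datetime is a subclass of date, so both inputs work; only the
# year/month/day components are read and the time is ignored.
print(dekad_of_year(date(2020, 2, 21)))              # 6
print(dekad_of_year(datetime(2020, 2, 21, 15, 45)))  # 6
```

Only the date components are extracted, so the result is the same whether or not a time is present.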