IDEMSInternational / epicsawrap

GNU Lesser General Public License v3.0

annual_temperature_summaries not returning data #47

Closed ChrisMarsh82 closed 8 months ago

ChrisMarsh82 commented 1 year ago

params: { "country": "zm", "station_id": "01122", "summaries": [ "mean_tmin" ] }

error: Error in total_temperature_summaries(country = country, station_id = station_id, : \n 'mean_tmin' has been given in summaries but no data is given in definitions json file.\n

If station 16 or test_1 is used instead, the error is: Error: 'OrdDict' object has no attribute 'columns'

lilyclements commented 1 year ago

@ChrisMarsh82

On error 1: Danny set up station 01122 as a dummy station before we had the Zambia data. This station does not have any temperature data in it, so the code will not work here. That is why the error says "'mean_tmin' has been given in summaries but no data is given in definitions json file": the station has no temperature columns. You'll see the same error from the other temperature functions with this station.

Two options:

  1. You can add temperature data into the dummy data and test with that (I can't see the benefit to this!)
  2. You can test using our non-dummy Zambia data, or test_1

If that error message wasn't meaningful to you, could you suggest a better one?
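As one possible shape for a clearer message, here is a minimal Python sketch of a pre-check the calling side could run before handing the request to R. It is not part of epicsawrap: `check_summaries` is a hypothetical helper, and it assumes the station's definitions have already been loaded into a plain dict keyed by summary name, which may not match the real definitions file layout.

```python
def check_summaries(station_id, requested, definitions):
    """Raise ValueError naming each requested summary that has no definition.

    `definitions` is assumed (hypothetically) to be a dict keyed by summary
    name; an absent or empty entry is treated as "not defined".
    """
    missing = [name for name in requested if not definitions.get(name)]
    if missing:
        # List what *is* defined so the caller can see what the station supports.
        available = sorted(key for key, value in definitions.items() if value)
        raise ValueError(
            f"Station '{station_id}' has no definition for: {', '.join(missing)}. "
            f"Defined summaries: {', '.join(available) or 'none'}. "
            "The station may not contain the columns these summaries need."
        )
```

With the dummy station, the message would then name the station and the missing summary together, rather than only the summary.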

On error 2: I am not able to replicate this. How are you running it? If I run the following in R, it works:

season_start_probabilities("zm", "16")

lilyclements commented 1 year ago

@ChrisMarsh82 I looked into the second error further, and it seems to be an issue on the Python side. Is the data read into Python ok? I know that Stephen would request things to be in a specific format for Python (e.g., see #9 where he requests integer columns for Month). From a quick scan, this seems to be in the format he used to request, but there might be something I'm missing.

ChrisMarsh82 commented 1 year ago

@lilyclements The second issue is a bit confusing. If mean_tmin and mean_tmax are both requested, the query works fine. If only mean_tmin is requested, the dataframe still comes back, but it looks slightly different and rpy2 cannot read it. (This is true for both the annual and monthly summaries.)

This is how the working dataframe looks:

# A tibble: 67 x 4
# Groups:   station_name [1]
   station_name  year mean_tmin mean_tmax
   <fct>        <int>     <dbl>     <dbl>
 1 LUNDAZI MET   1956      NA        NA
 2 LUNDAZI MET   1957      13.4      27.3

and this is how the non-working dataframe looks:

$mean_tmin
# A tibble: 67 x 3
# Groups:   station_name [1]
   station_name  year mean_tmin
   <fct>        <dbl>     <dbl>
 1 LUNDAZI MET   1956      NA
 2 LUNDAZI MET   1957      13.4

There is an extra $mean_tmin
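In R console output, a leading `$mean_tmin` line means the value is a named list with one element, with the tibble inside it. A minimal sketch of a Python-side workaround follows. It assumes the converted result arrives as a dict-like object (the error message suggests rpy2's OrdDict, which I believe behaves like a dict, but I have not verified that here). `unwrap_single_summary` is a hypothetical helper, not part of rpy2 or epicsawrap.

```python
from collections.abc import Mapping


def unwrap_single_summary(result):
    """Return the lone value of a one-entry mapping; otherwise return `result` unchanged.

    Anything that already exposes `.columns` (i.e. looks like a dataframe)
    is passed through untouched.
    """
    if isinstance(result, Mapping) and not hasattr(result, "columns"):
        values = list(result.values())
        if len(values) == 1:
            return values[0]
    return result
```

This only hides the difference on the calling side. Returning the same shape from R for one summary and for several would remove the need for it.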

Not sure if this helps, as it may just be how Python interprets it. I tried to trace the difference in the R code and found this in climatic_summary.R (rpicsa):

if (length(elements) == 1 && 
      elements == "obsValue" && 
      "describedBy" %in% names(data)) {
    element_names <- as.character(unique(data[["describedBy"]]))
    data <- elements_wider(data, name = "describedBy", value = elements)
    elements <- element_names
  }

Not sure if this is relevant, but it does show a difference between sending in one element and sending in two.

lilyclements commented 1 year ago

@ChrisMarsh82 could this be related to issue #40? That's been on my radar for far too long (so long that it fell off my radar). It looks like both these issues are reporting a problem with the listing output. I'll get to this this afternoon - I'm fairly positive this will help the problem!

lilyclements commented 1 year ago

@ChrisMarsh82 this should now be fixed. I accidentally made the change on my main branch rather than on a sub-branch, so I couldn't open a PR, but the fix is outlined here. Can you try updating your packages and running it again?

To update your package in R:

devtools::install_github("IDEMSInternational/epicsawrap")

Let me know how you get on. Hopefully this issue can be closed.