EIDA / eida-statistics

Aggregated statistics of EIDA nodes
GNU General Public License v3.0

Output of human example links #20

Closed jmsaurel closed 1 year ago

jmsaurel commented 1 year ago

Hello,

Thanks for this very nice webservice. While playing with the human example links, I came up with a question about the CSV content.

The nb_reqs column is always None. Shouldn't it be at least equal to the nb_successful_reqs column?

Also, the country column always shows *. Maybe this feature is not implemented yet?

vpet98 commented 1 year ago

Hello,

The nb_reqs column is None for statistics of the RESIF datacenter (and maybe of some others) because the value is genuinely empty for those statistics in the database. The webservice code could be changed to return the nb_successful_reqs value in this case, but it would probably be wiser to update the database.

The country column shows * whenever results are requested for more than one country. In the 4th example, where results are requested per country and each returned result object therefore refers to a single country, the country column shows the actual country values.
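Until the database is updated, a client can handle the empty value on its own side. Below is a minimal Python sketch of that idea, not the webservice's code: it parses a CSV response, turns the literal string None into a real missing value, and optionally falls back to nb_successful_reqs. The function name parse_stats and the flag fallback_to_successful are hypothetical, and the sample rows are copied from the RESIF output quoted further down in this thread.

```python
import csv

# Sample response copied from this thread; header comment lines start with '#'.
SAMPLE = """\
# version: 1.0.0
# matching: start=2021-01&end=2021-12&datacenter=RESIF&network=G&station=FDFM&country=FR,EN,US
# aggregated_on: month,location,channel
month,datacenter,network,station,location,channel,country,bytes,nb_reqs,nb_successful_reqs,clients
*,RESIF,G,FDFM,*,*,FR,2399797760,None,15048,12
*,RESIF,G,FDFM,*,*,US,104714240,None,1204,5
"""


def parse_stats(text, fallback_to_successful=False):
    """Parse the CSV output into dicts (hypothetical client-side helper).

    The literal string "None" in nb_reqs becomes Python None. If
    fallback_to_successful is True, a missing nb_reqs is replaced by
    nb_successful_reqs, which is only a lower bound on the real count.
    """
    # Drop empty lines and the '#' comment header before handing over to csv.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = []
    for row in csv.DictReader(data_lines):
        successful = int(row["nb_successful_reqs"])
        nb_reqs = None if row["nb_reqs"] == "None" else int(row["nb_reqs"])
        if nb_reqs is None and fallback_to_successful:
            nb_reqs = successful
        rows.append({**row, "nb_reqs": nb_reqs, "nb_successful_reqs": successful})
    return rows


if __name__ == "__main__":
    for r in parse_stats(SAMPLE, fallback_to_successful=True):
        print(r["country"], r["nb_reqs"], r["nb_successful_reqs"])
```

Keeping the fallback opt-in leaves the raw data visible by default, so a missing value is never silently presented as a measured one.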

jschaeff commented 1 year ago

Yep, RESIF is the bad guy here. We only compute stats on successful requests, and we don't count the others.

I need to bang my head on the walls a couple of times and think about a solution ...

jmsaurel commented 1 year ago

> The nb_reqs column appears at None for statistics of RESIF datacenter (and maybe of some others) because there is truly an empty value for such statistics in the database. The webservice code could be changed to give the same number as in nb_successful_reqs in this case, but maybe it would be wiser to update the database.

Ok, I understand. I would say the best solution would be that the datacenter places a value there, at least the same number as nb_successful_reqs. I'm not sure it's such a good idea to have the webservice not completely transparent.

> The country column is showing * whenever results are requested for more than one country.

Hum, I see. The webservice is designed so that one can request the statistics for a specific country or a list of countries, or get the overall count. Am I correct? I would like to request the statistics for all countries and get an idea of how my data are spread across the world. Would it be possible for country=* or country=?? to match all available countries in the database?

For example, I would like the same output I get with this option with country=* (because I don't know which countries are using my data and I don't want to specify the ~190 existing country codes).

curl "http://ws.resif.fr/eidaws/statistics/1//dataselect/query?start=2021-01&end=2021-12&datacenter=RESIF&network=G&station=FDFM&country=FR,EN,US&aggregate_on=month"
# version: 1.0.0
# matching: start=2021-01&end=2021-12&datacenter=RESIF&network=G&station=FDFM&country=FR,EN,US
# aggregated_on: month,location,channel
month,datacenter,network,station,location,channel,country,bytes,nb_reqs,nb_successful_reqs,clients
*,RESIF,G,FDFM,*,*,FR,2399797760,None,15048,12
*,RESIF,G,FDFM,*,*,US,104714240,None,1204,5

vpet98 commented 1 year ago

> Hum, I see. The webservice is designed so that one can request the statistics for a specific country or a list of countries, or get the overall count. Am I correct?

Yes, you are correct.

> I would like to request the statistics for all countries and get an idea of how my data are spread across the world. Would it be possible for country=* or country=?? to match all available countries in the database? For example, I would like the same output I get with this option with country=* (because I don't know which countries are using my data and I don't want to specify the ~190 existing country codes).

If the country parameter is not specified at all in the request, the webservice takes all countries available in the database. So, the request you want would simply be:

curl "http://ws.resif.fr/eidaws/statistics/1//dataselect/query?start=2021-01&end=2021-12&datacenter=RESIF&network=G&station=FDFM&aggregate_on=month"

The same holds for the other parameters. Only the network, station, location and channel parameters also accept an explicit wildcard, e.g. network=*, which is the same as not specifying the network parameter at all.
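For scripted use, the "omit the parameter to match everything" rule can be sketched in Python as below. This is only an illustration under stated assumptions: build_query_url is a hypothetical helper, the base URL is copied as it appears in this thread, and a value of None is my own convention for "do not send this parameter".

```python
from urllib.parse import urlencode

# Base URL copied verbatim from the curl examples in this thread.
BASE_URL = "http://ws.resif.fr/eidaws/statistics/1//dataselect/query"


def build_query_url(base_url=BASE_URL, **params):
    """Build a query URL, leaving out every parameter set to None.

    Leaving a parameter out of the request means "all values" for that
    parameter, so country=None simply drops country from the URL.
    """
    kept = {name: value for name, value in params.items() if value is not None}
    # safe="," keeps comma-separated lists such as FR,US readable in the URL.
    return base_url + "?" + urlencode(kept, safe=",")


if __name__ == "__main__":
    # All countries: country is omitted from the request.
    print(build_query_url(start="2021-01", end="2021-12", datacenter="RESIF",
                          network="G", station="FDFM", country=None,
                          aggregate_on="month"))
    # Restricted to a list of countries.
    print(build_query_url(start="2021-01", end="2021-12", datacenter="RESIF",
                          network="G", station="FDFM", country="FR,US",
                          aggregate_on="month"))
```

The first call produces the same URL as the curl request above; the second adds country=FR,US between the station and aggregate_on parameters.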

jmsaurel commented 1 year ago

Thanks @vpet98, it works much better without the network parameter ;).

I think I mixed things up with the aggregate option (I had aggregate_on=country at some point, which I think explains why I didn't see the list of countries).

jschaeff commented 1 year ago

Thank you for your feedback Jean-Marie,

We need to make the documentation somewhat clearer about the behaviour of the aggregate_on parameter. I would love to get your ideas, as a user.

Cheers,