BirdsCanada / NatureCountsAPI

Allow short description in data requests #26

Closed pmorrill closed 4 years ago

pmorrill commented 5 years ago

From Denis' email:

I just had a thought about the naturecounts package. It would be useful to allow/request that people include a short description of the reason why they are asking for data, and store this in the database (project_details field).

I think we could make this a requirement in R only, and not in the API. Paul, can you modify the API to accept that new field and add to the data_request table?

Steffi, once Paul has set that up, could you modify the R package so it’s a required element, EXCEPT if they are using an existing request ID. Not sure what we do with people with dataset permissions. I would be inclined to require the details too for them.

I expect that some people (many?) won’t be entering anything useful there, but hopefully, enough people will. We could give them some simple examples (“COSEWIC report”, “Impact Assessment Study”, “School project”, etc.), but I would leave this pretty loose. As long as they provide something, even if it’s garbage.

Let me know if you think this poses too much of a barrier on users, but I’m hoping this is a reasonable compromise, and an incentive for people (e.g. CMMN stations) to raise their data to level 5.

Thanks!

D

steffilazerte commented 5 years ago

Just for data downloads, though, right? Not for queries related to number of observations etc.

denislepage commented 5 years ago

Correct, just downloads.

steffilazerte commented 5 years ago

I've added an info argument that is required unless the user supplies a valid request_id. It probably doesn't match the parameter expected by the API, but I can change that once the API side is ready.
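
The rule described here can be illustrated with a minimal sketch. It is written in Python rather than R, and the function name `check_download_args` is hypothetical, not part of the naturecounts package. It only checks that a request_id is present; deciding whether that ID is actually valid is assumed to be left to the server.

```python
from typing import Optional


def check_download_args(info: Optional[str] = None,
                        request_id: Optional[int] = None) -> None:
    """Raise ValueError unless a download has a request_id or a non-empty info."""
    if request_id is not None:
        # An existing request is assumed to already carry its project details.
        return
    if info is None or not info.strip():
        raise ValueError(
            "Please supply `info`, a short description of why you need the "
            "data (e.g. 'School project'), or an existing `request_id`."
        )


# Both of these pass silently; calling with neither argument raises ValueError.
check_download_args(info="COSEWIC report")
check_download_args(request_id=156361)
```

Any non-empty text is accepted, in line with the suggestion above to keep the requirement loose.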

Right now, all the naturecounts package examples, vignettes and tests use nc_example, nc_vignette, or nc_test, which should tell the API when it's being accessed by my own testing, by CRAN testing and building (if/when we submit to CRAN), by Travis CI/AppVeyor (if/when I set up remote testing), or by users trying out the examples or vignettes.

pmorrill commented 5 years ago

Using a parameter called 'info' is fine with me. I have been with family for 2 days, but will get the API set up to receive this sometime next week.

denislepage commented 5 years ago

Do you use specific naturecounts user accounts for testing? If those are being saved in the data request table, it would be good to have a way to flag them when we calculate metrics (e.g. number of requests per collection, etc.).

steffilazerte commented 5 years ago

I use the "sample" account for all unit tests and examples, and I imagine that users would use that account when they're playing around too.

pmorrill commented 5 years ago

I have added support in the server for an 'info' parameter, but I am not clear which API entry point will receive this new parameter. The sequence we use is first to query via /list_collections, then to query via /get_data. The latter call requires a requestId. So, can I assume you are sending in the 'info' field value as part of the /list_collections call?

pmorrill commented 5 years ago

Looking at the R code, I think that you are delaying and sending it in as part of the /get_data call (nc_coll_dl). That's ok - I can work with that as well. But it does mean that the same info value is submitted for each collection queried. On my side, the DataRequest object that carries the info field is built on the initial /list_collections query. It might be more efficient to submit the 'info' field at that time (nc_count_internal).

If you decide to switch it, the code I am putting in place today will support it that way as well.

steffilazerte commented 5 years ago

Yup that's what I was doing. Makes sense to supply it as part of list_collections, but can we make it optional on the API? That way I can make it required in R, only if users actually go through with the download. If users use list_collections just to see counts, and don't actually download the data, they don't have to supply a request_id. Would that work? It'll be a couple of days before I can make changes in any case.

pmorrill commented 5 years ago

Fair enough: that's a good argument for a delay in sending it. It is in fact optional in the api, so if you have a way to stick-handle it within the interface as you describe, that should work.
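
A minimal sketch of the arrangement discussed above, in Python for illustration only. The helper `list_collections_params` and the `collection` filter key are hypothetical; only the `info` parameter name comes from this thread. The point is that `info` is attached to the /list_collections query only when the caller provides it, so count-only queries can omit it.

```python
from typing import Dict, Optional


def list_collections_params(filters: Dict[str, str],
                            info: Optional[str] = None) -> Dict[str, str]:
    """Build query parameters for a /list_collections call (hypothetical helper)."""
    params = dict(filters)  # copy so the caller's dict is not modified
    if info is not None:
        # Only sent when a download is being initiated; optional on the API side.
        params["info"] = info
    return params


# Count-only query: no info needed.
count_query = list_collections_params({"collection": "EXAMPLE"})

# Query that starts a download: info travels with the same call.
download_query = list_collections_params({"collection": "EXAMPLE"},
                                         info="School project")
```

The client can then enforce `info` only on the download path, while the API keeps treating it as optional.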

steffilazerte commented 4 years ago

I forgot about this until I started housekeeping the issues today.

I've moved the info parameter to be sent as part of the list_collections call when the data download is initiated.

I ran a test download with my user: steffilazerte and the info "testing info parameter". Did it work?

If so I think we can close this issue.

pmorrill commented 4 years ago

Do you know the requestId you were using? That might be the easiest way to track it down...

steffilazerte commented 4 years ago

Yup: 156361

pmorrill commented 4 years ago

OK, will look for it tomorrow.

pmorrill commented 4 years ago

This seems to have worked. I see 'Testing info parameter' in the project_details field of request_id = 156361. Good.