mcglinnlab / soar

SOAR: species occurrence aggregator in R

User can download all records for a given taxon on inat #7

Closed dmcglinn closed 6 years ago

dmcglinn commented 6 years ago

This may fundamentally change how our app works. It is quite easy to download all the records for a species using the iNaturalist export interface: https://www.inaturalist.org/observations/export?verifiable=true&page=1&taxon_id=5305&preferred_place_id=1&locale=en&user_id=

Essentially, we would just have to convert the queried name into the iNaturalist taxon_id and substitute it into the URL above. For example, for red oaks use:

https://www.inaturalist.org/observations/export?verifiable=true&page=1&taxon_id=49005&preferred_place_id=1&locale=en&user_id=
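The substitution itself could be sketched in R roughly as follows. This is only an illustration: the function name `inat_export_url` is hypothetical, and the query parameters are copied from the example URLs above rather than taken from any iNaturalist documentation.

```r
# Build the iNaturalist export URL for a single taxon_id.
# Hypothetical helper; the parameters mirror the example URLs in this thread.
inat_export_url <- function(taxon_id) {
  stopifnot(length(taxon_id) == 1, !is.na(taxon_id))
  paste0(
    "https://www.inaturalist.org/observations/export",
    "?verifiable=true&page=1",
    # format() avoids scientific notation for large numeric ids
    "&taxon_id=", format(taxon_id, scientific = FALSE),
    "&preferred_place_id=1&locale=en&user_id="
  )
}

# Example: the red oak id mentioned above
inat_export_url(49005)
```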

That will generate a dump of all the records, including images if they are requested. For example, here is the download link for all the red oak records: https://www.inaturalist.org/attachments/flow_task_outputs/718769/observations-27221.csv.zip?1519932984

For some species the dump will take a long time to generate. The API that we have been using to access the data is fast and light, but it caps the number of records it returns. If a user wants all the records, then this alternative approach is warranted: we retrieve the taxon_id for them, direct them to the export webpage to create their data download, and ask them to paste the resulting download link into a box in the app, which R would then import and process for them.
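The import step could look something like the sketch below, using only base R. The function names `read_inat_zip` and `import_inat_export` are hypothetical, and the sketch assumes the pasted link points to a zip archive containing a single CSV, as the red oak link above suggests.

```r
# Read the first CSV found inside a local zip archive.
# Hypothetical helper; assumes the archive holds one CSV of observations.
read_inat_zip <- function(zip_path) {
  exdir <- tempfile("inat_export_")
  dir.create(exdir)
  # unzip() returns the paths of the extracted files
  files <- utils::unzip(zip_path, exdir = exdir)
  csv_files <- files[grepl("\\.csv$", files, ignore.case = TRUE)]
  if (length(csv_files) == 0) stop("No CSV file found in the archive.")
  utils::read.csv(csv_files[1], stringsAsFactors = FALSE)
}

# Download the archive behind a user-supplied link, then read it.
import_inat_export <- function(link) {
  zip_path <- tempfile(fileext = ".zip")
  # mode = "wb" keeps the binary archive intact on Windows
  utils::download.file(link, zip_path, mode = "wb", quiet = TRUE)
  read_inat_zip(zip_path)
}
```

In the app, `import_inat_export()` would be called with whatever link the user pastes into the box. Whether the link can be fetched without being logged in is not something this sketch checks.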

AshleyWoods commented 6 years ago

That could be one of the tabs? Could we have a tab that gives more fields to fill out? One field to enter the common name and the download link, and a second field to give the taxon_id and the link.

dmcglinn commented 6 years ago

I think looking up the taxon_id based on the taxon name is the best idea. As this is an iNaturalist internal number, no users are going to know it up front. Can you look into web scraping tools for R that would allow us to fill out the web form for the user?
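As a possible alternative to scraping for the name lookup, here is a rough sketch that queries the iNaturalist API taxa search instead of filling out a form. It rests on assumptions: the function names are hypothetical, the jsonlite package must be installed, and the response is assumed to carry a `results` element with an `id` column. Taking the first match is a simplification, since a name can match several taxa.

```r
# Build the taxa search URL for a queried name (hypothetical helper).
taxa_search_url <- function(taxon_name) {
  paste0("https://api.inaturalist.org/v1/taxa?q=",
         utils::URLencode(taxon_name, reserved = TRUE))
}

# Pull the first taxon id out of a parsed response.
# Assumes `parsed$results` has an `id` column; returns NA if nothing matched.
first_taxon_id <- function(parsed) {
  results <- parsed$results
  if (is.null(results) || NROW(results) == 0) return(NA_integer_)
  as.integer(results$id[1])
}

# Query the API and return the first matching id (needs jsonlite and network).
lookup_taxon_id <- function(taxon_name) {
  first_taxon_id(jsonlite::fromJSON(taxa_search_url(taxon_name)))
}
```

The returned id could then be substituted into the export URL. The user would still need to confirm that the first match is the taxon they meant.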

AshleyWoods commented 6 years ago

Possible Tools: rvest seems to be a popular one.

Beginners Guide: https://www.analyticsvidhya.com/blog/2017/03/beginners-guide-on-web-scraping-in-r-using-rvest-with-hands-on-knowledge/

Rstudio page for rvest: http://blog.rstudio.com/2014/11/24/rvest-easy-web-scraping-with-r/