gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

I used the rgbif occ_download() function for an ongoing research project. I did not, however, save the DOI for each individual download. My understanding is that I could combine the CSV output files to create a derived dataset with a single DOI (for ease of citing - we have 139 species in the study). Is it possible to either a) track down the original DOIs associated with my downloads, or b) create a derived dataset without them? Any help would be much appreciated! #3530

Open gbif-portal opened 3 years ago

gbif-portal commented 3 years ago

I used the rgbif occ_download() function for an ongoing research project. I did not, however, save the DOI for each individual download. My understanding is that I could combine the CSV output files to create a derived dataset with a single DOI (for ease of citing - we have 139 species in the study). Is it possible to either a) track down the original DOIs associated with my downloads, or b) create a derived dataset without them? Any help would be much appreciated!



silknets commented 3 years ago

Contact is silknets@vt.edu or via this GitHub account!

dnoesgaard commented 3 years ago

Hi there,

If you used occ_download(), you should be able to locate the relevant downloads and their DOIs in your account: https://www.gbif.org/user/download

If that isn't the case, or if you simply wish to combine downloads regardless of the way they were obtained, you can indeed create a derived dataset. You simply need to summarize the combined dataset by the column datasetKey. More information is available here: https://www.gbif.org/derived-dataset/about
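As a rough illustration of that summary step, the sketch below counts records per `datasetKey` across several delimited files using only the Python standard library. It assumes every file has a header row that includes a `datasetKey` column; the function names `count_by_dataset_key` and `write_summary` and the output layout are hypothetical choices for this example, not something GBIF prescribes.

```python
import csv
from collections import Counter
from typing import Iterable


def count_by_dataset_key(paths: Iterable[str], delimiter: str = ",") -> Counter:
    """Count occurrence records per datasetKey across several delimited files."""
    counts: Counter = Counter()
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle, delimiter=delimiter):
                key = (row.get("datasetKey") or "").strip()
                if key:  # skip rows that carry no dataset key
                    counts[key] += 1
    return counts


def write_summary(counts: Counter, out_path: str) -> None:
    """Write a two-column CSV (datasetKey, count), largest datasets first."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["datasetKey", "count"])
        writer.writerows(counts.most_common())
```

Pass `delimiter="\t"` for tab-separated files. The resulting table of keys and counts is the kind of per-dataset summary described above; check the derived-dataset page for the exact format it expects.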

Let me know if you have any questions!

silknets commented 3 years ago

Daniel, thank you for the reply.

I provided incorrect information above, which seems to be the root cause of the issue. I used the occ_search() function to retrieve GBIF records, not occ_download() as I previously stated. The latter appears to require a GBIF login, which explains why I don't see the downloads in question in the download records associated with my profile.

Does the occ_search() query generate any DOIs or datasetKeys that I could track down, or would this require the query to be re-run as an occ_download()?

dnoesgaard commented 3 years ago

occ_search() does not generate DOIs. If you still have the original data saved, does it have the column datasetKey?

silknets commented 3 years ago

I don't have the original data saved. I used occ_search(), then applied a series of filters to this data for each species before saving the CSV output. This output doesn't include datasetKey.

silknets commented 3 years ago

However, the data has retained the following columns, in case that is helpful: datasetName, occurrenceID, catalogNumber, institutionCode

dnoesgaard commented 3 years ago

Could you paste a couple of lines here?

silknets commented 3 years ago

Acantharchus pomotis_occurrence data.txt

I couldn't upload the CSV file, but here it is as a txt file. I'm also pasting the first 5 lines below:


| occurrenceStatus | species | decimalLongitude | decimalLatitude | year | eventDate | issues | geodeticDatum | countryCode | country | datasetName | occurrenceID | catalogNumber | institutionCode | locality | decimalLongitude.1 | decimalLatitude.1 | optional |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PRESENT | Acantharchus pomotis | -79.1819 | 33.87465 | 2021 | 2021-03-23T09:07:32 | cdround,gass84 | WGS84 | US | United States of America | iNaturalist research-grade observations | https://www.inaturalist.org/observations/71874817 | 71874817 | iNaturalist | NA | -79.1819 | 33.87465 | TRUE |
| PRESENT | Acantharchus pomotis | -83.6579 | 30.46513 | 2021 | 2021-04-01T17:31:28 | cdround,gass84 | WGS84 | US | United States of America | iNaturalist research-grade observations | https://www.inaturalist.org/observations/75680450 | 75680450 | iNaturalist | NA | -83.6579 | 30.46513 | TRUE |
| PRESENT | Acantharchus pomotis | -82.4704 | 30.2839 | 2021 | 2021-04-26T17:20:00 | cdround,gass84 | WGS84 | US | United States of America | iNaturalist research-grade observations | https://www.inaturalist.org/observations/77743228 | 77743228 | iNaturalist | NA | -82.4704 | 30.2839 | TRUE |
| PRESENT | Acantharchus pomotis | -82.3384 | 30.36415 | 2021 | 2021-05-24T15:09:59 | gass84 | WGS84 | US | United States of America | iNaturalist research-grade observations | https://www.inaturalist.org/observations/80203305 | 80203305 | iNaturalist | NA | -82.3384 | 30.36415 | TRUE |
| PRESENT | Acantharchus pomotis | -82.2869 | 30.29865 | 2021 | 2021-05-24T15:05:14 | cdround,gass84 | WGS84 | US | United States of America | iNaturalist research-grade observations | https://www.inaturalist.org/observations/80203259 | 80203259 | iNaturalist | NA | -82.2869 | 30.29865 | TRUE |
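Since these files keep `datasetName` but not `datasetKey`, one possible way to recover keys is to look each distinct name up in the GBIF registry. The sketch below is only an assumption about how that could be done, not a method confirmed in this thread: it assumes the dataset search endpoint at `https://api.gbif.org/v1/dataset/search` accepts a `q` parameter and returns a `results` list whose items have `key` and `title` fields, and the function names are hypothetical. It only accepts a key when exactly one title matches the name, because dataset titles are not guaranteed to be unique or to equal `datasetName`.

```python
import json
from typing import Callable, Dict, Iterable, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the GBIF registry dataset search; verify against the API docs.
SEARCH_URL = "https://api.gbif.org/v1/dataset/search"


def gbif_dataset_search(query: str) -> List[dict]:
    """Return raw dataset search results for a free-text query (needs network)."""
    url = f"{SEARCH_URL}?{urlencode({'q': query, 'limit': 20})}"
    with urlopen(url, timeout=30) as response:
        return json.load(response).get("results", [])


def match_dataset_keys(
    names: Iterable[str],
    search: Callable[[str], List[dict]] = gbif_dataset_search,
) -> Dict[str, Optional[str]]:
    """Map each distinct datasetName to a key only when one title matches exactly."""
    matches: Dict[str, Optional[str]] = {}
    for name in sorted(set(names)):
        wanted = name.strip().casefold()
        if not wanted:
            continue  # ignore blank names
        hits = [
            result for result in search(name)
            if (result.get("title") or "").strip().casefold() == wanted
        ]
        # Missing or ambiguous matches stay None so they can be reviewed by hand.
        matches[name] = hits[0]["key"] if len(hits) == 1 else None
    return matches
```

Passing the search function as a parameter keeps the matching logic checkable offline with a stub. Any name that maps to `None` would still need manual lookup, for example via `occurrenceID` or `institutionCode`.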

dnoesgaard commented 3 years ago

Ok, thanks.

Would you be able to send me all the CSVs? It is probably easiest to email them as a zip file (dnoesgaard@gbif.org). I'll see what I can do from there.