SpeciesFileGroup / taxonworks

Workbench for biodiversity informatics.
http://taxonworks.org
MIT License

Follow up from last week's digitization meeting: specific use case #2665

Closed kandakoj closed 2 years ago

kandakoj commented 2 years ago

This issue stems from last week's TaxonWorks meeting, where I asked about the best way to export specific bits of data from my TaxonWorks project. @mjy asked me to post exactly what I wanted to export; maybe I just need to read up more on stringing together and filtering results from API calls.

As a user, I have a set of OTUs in an observation matrix. For each of these OTUs, I want to export the following bits of related data for a print catalogue. The scope of the export isn't a specific taxon and all its descendants, or the taxa in a specific asserted distribution; I want the export limited to data related to the OTUs in the observation matrix.

  1. The taxon name tied to that OTU. For this, I want:

    • The original combination: Genus species subspecies (if present) author, year: page
    • If the current combination is different from the original, the current combination and the original citation for the transfer.
    • If the taxon name tied to the OTU is invalid, just the specific relationship/reason why it is invalid (e.g., Subjective synonym of XXXXX in XXX, XXXX).
  2. For the taxon name tied to the OTU, there is a digitized collection object; a type specimen. For this, I want:

    • The specimen catalogue number
    • Buffered collecting event
    • Buffered other labels
    • Type of type
    • Source (if one is associated with the type)
  3. Back to the observation matrix, I have some columns with data to export:

    • sex (qualitative)
    • higher grouping (qualitative)
    • Published photos (free text)
    • Remarks (free text)
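
The three groups of fields above could be flattened into one row per OTU for the print catalogue. The following is only a minimal sketch of that row shape, assuming a CSV export is acceptable; `CatalogueRow`, `rows_to_csv`, and every field name are hypothetical labels for the items in the list and are not TaxonWorks API attribute names.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class CatalogueRow:
    """One export row per OTU. All field names are hypothetical."""
    otu_id: int
    # 1. Taxon name
    original_combination: str
    current_combination: Optional[str] = None  # only if it differs from the original
    invalid_reason: Optional[str] = None       # only if the name is invalid
    # 2. Type specimen
    catalogue_number: Optional[str] = None
    buffered_collecting_event: Optional[str] = None
    buffered_other_labels: Optional[str] = None
    type_of_type: Optional[str] = None
    type_source: Optional[str] = None
    # 3. Observation matrix columns
    sex: Optional[str] = None
    higher_grouping: Optional[str] = None
    published_photos: Optional[str] = None
    remarks: Optional[str] = None


def rows_to_csv(rows):
    """Serialize rows to CSV text; missing (None) values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(CatalogueRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

Keeping the optional fields as `None` by default means an OTU without a digitized type specimen, or without a differing current combination, still produces a complete row.
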
kandakoj commented 2 years ago

Followed your advice; "sex" now included as a biocuration attribute in my project.

mjy commented 2 years ago

@kandakoj, almost have this, but need some clarification:

kandakoj commented 2 years ago

  1. I meant the citation for the combination that is currently used; if you were running tests on my data, I don't know if I have actual examples where I included the citation for the first use of the currently accepted combination.
  2. For this I meant the citation on a type material (e.g., for subsequent lectotype or neotype designations); again, I haven't digitized the data for this in my dataset.
  3. Sorry, I hadn't created some of the columns yet. Anyway, there is now some data in the following columns:

Higher groupings: 1409; Published photos: 1482; Remarks: 1453. Sex I added as a biocuration group in my project.

mjy commented 2 years ago

@kandakoj, the script is up. I don't think it's perfect, but it should give you something to chew on. See if you can find your way through https://github.com/SpeciesFileGroup/taxonworks_api_scripts.

mjy commented 2 years ago

The new endpoint used here is /taxon_names/123/status. It will likely be further namespaced as /taxon_names/123/inventory/status in the future.
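
For anyone scripting against this, here is a minimal sketch of building the request URL for the status endpoint, with a switch for the possible future namespaced path. The base URL is a placeholder, and passing the token as a `project_token` query parameter is an assumption on my part; check your own instance and the API docs. `status_url` is a hypothetical helper, not part of the scripts repository.

```python
from urllib.parse import urlencode

# Assumption: placeholder base URL; substitute your own instance's API root.
API_BASE = "https://example.org/api/v1"


def status_url(taxon_name_id, project_token, base=API_BASE, namespaced=False):
    """Build the URL for a taxon name's status.

    namespaced=True targets the /inventory/status form that the endpoint
    may move to in the future.
    """
    path = "inventory/status" if namespaced else "status"
    # Assumption: the project token is sent as a query parameter.
    query = urlencode({"project_token": project_token})
    return f"{base}/taxon_names/{taxon_name_id}/{path}?{query}"


if __name__ == "__main__":
    print(status_url(123, "YOUR_TOKEN"))
    print(status_url(123, "YOUR_TOKEN", namespaced=True))
```

Keeping the path choice behind one flag means a script only needs a single change if the endpoint is moved under `inventory`.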