specify / specify7

Specify 7
https://www.specifysoftware.org/products/specify-7/
GNU General Public License v2.0

Allow configuring synonyms during initial imports #3534

Open grantfitzsimmons opened 1 year ago

grantfitzsimmons commented 1 year ago

Is there a way to configure synonyms when batch-importing taxa to the tree (either through code or by formatting a table a certain way)? Dragging and dropping after a taxonomy tree is set up is doable, but it would be a lot easier to configure the synonym relationships from the get-go if possible.

Requested By: Mark Pitblado on the Speciforum (on behalf of the University of British Columbia - Beaty Biodiversity Museum)

grantfitzsimmons commented 1 year ago

This is something that people have requested a number of times.

specifysoftware commented 1 year ago

This issue has been mentioned on Specify Community Forum. There might be relevant details there:

https://discourse.specifysoftware.org/t/trees-in-specify-7/534/4

specifysoftware commented 1 year ago

This issue has been mentioned on Specify Community Forum. There might be relevant details there:

https://discourse.specifysoftware.org/t/sp7-workbench-import-taxa-as-synonyms/1425/2

grantfitzsimmons commented 9 months ago

Mentioned as a major drawback by Greg Pohl during our meeting with the Canadian Forestry Service.

bronwyncombs commented 8 months ago

Requested by Ashley Ferguson at the College of Idaho

grantfitzsimmons commented 7 months ago

We need to upload a (long and fairly clean) list of new taxa prior to uploading the actual specimens/records that bear those names. I know many of those binomials are synonyms of others (specific ones) in that same list, where each synonym is unequivocally linked to its accepted/preferred name. However, I can’t figure out how to map that relationship in the WorkBench. Fields like Taxon.acceptedTaxon and Taxon.isAccepted (populating it would help down the line) are not available in the WB. But neither is Determination.preferredTaxon if one is to try an alternative route. Again, it’s a long list and synonymizing binomials one by one on the taxon tree would be painful. I must be missing something. Thanks.

Requested by: Iñigo from CSIC (Discourse)

specifysoftware commented 7 months ago

This issue has been mentioned on Specify Community Forum. There might be relevant details there:

https://discourse.specifysoftware.org/t/establishing-relationship-between-synonym-and-preferred-accepted-taxon-en-masse/1663/2

philippeverley commented 7 months ago

+1 for migrating CAY (French Guiana) collections to Specify. We have thousands of taxa to import. Taxon synonymy is perfectly identified in the source database, but we are going to lose this information in the initial taxon import, prior to the collection objects import. We'll have to think of an SQL-based solution to automate the process while this feature is not available directly in WB7.

FedorSteeman commented 2 months ago

We recently encountered this issue again, after I handed a dataset over to @AstridBVW for her to import. It needed the collection object as the base table and included the preferred names for the synonyms that are used in the taxon determinations. One of the unexpected discoveries was that we couldn't even map any column to the "IsPreferred" field, which otherwise would have enabled this import.

In the attached excerpt you will find a range of columns that represent the preferred name. Logically, we would want to map all these including the "IsPreferred" flag to the corresponding fields in Specify, but still can't.

SNM-HerpDatabase-Excerpt2-csv.csv

How far are you guys with the implementation?

grantfitzsimmons commented 2 months ago

From @melton-jason:

Hi everyone!

Thank you all for providing example datasets and being patient.

I have created a repository on GitHub which demonstrates how the API can be used with Python and the requests library to create an application which mass-imports taxonomic data (including synonyms) to a Specify 7 instance.

In short, the demo takes a CSV containing information in the following format, creates a Mammalia taxon node if one does not exist, and uploads the taxon records under the Mammalia node (at the ranks specified in the CSV columns).

| Order | Family | Genus | Species | isAccepted | Author | AcceptedGenus | AcceptedSpecies | AcceptedAuthor |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Afrosoricida | Tenrecidae | Microgale | talazaci | Yes | Major, 1896 | | | |
| Afrosoricida | Tenrecidae | Oryzorictes | talpoides | No | G.Grandidier & Petit, 1930 | Oryzorictes | hova | A.Grandidier, 1870 |
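For anyone preparing their own data in this layout, here is a minimal sketch (not code from the demo repository) of how such a CSV could be read into taxon records plus a list of synonym-to-accepted links, using only the Python standard library. The function name `read_taxa` and the dictionary keys are hypothetical. It also assumes that an accepted name shares the Order and Family of its synonym, since the layout has no separate columns for those.

```python
import csv
import io

# Two sample rows in the column layout shown above.
SAMPLE = '''Order,Family,Genus,Species,isAccepted,Author,AcceptedGenus,AcceptedSpecies,AcceptedAuthor
Afrosoricida,Tenrecidae,Microgale,talazaci,Yes,"Major, 1896",,,
Afrosoricida,Tenrecidae,Oryzorictes,talpoides,No,"G.Grandidier & Petit, 1930",Oryzorictes,hova,"A.Grandidier, 1870"
'''


def read_taxa(text):
    """Parse CSV text into (taxa, synonyms).

    taxa maps (genus, species) to a record dict; synonyms is a list of
    (synonym_key, accepted_key) pairs. Both shapes are illustrative only.
    """
    taxa = {}
    synonyms = []
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["Genus"], row["Species"])
        accepted = row["isAccepted"].strip().lower() == "yes"
        taxa[key] = {
            "order": row["Order"],
            "family": row["Family"],
            "author": row["Author"],
            "accepted": accepted,
        }
        if accepted:
            continue
        accepted_key = (row["AcceptedGenus"], row["AcceptedSpecies"])
        if not all(accepted_key):
            raise ValueError(f"synonym {key} has no accepted name")
        # Assumption: the accepted name sits in the same Order and Family.
        # setdefault keeps an existing record if the name has its own row.
        taxa.setdefault(accepted_key, {
            "order": row["Order"],
            "family": row["Family"],
            "author": row["AcceptedAuthor"],
            "accepted": True,
        })
        synonyms.append((key, accepted_key))
    return taxa, synonyms


taxa, synonyms = read_taxa(SAMPLE)
```

With the two sample rows, `taxa` holds three names (the accepted name Oryzorictes hova is added even though it has no row of its own) and `synonyms` holds the single talpoides-to-hova link.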

By default, the application connects to https://sp7demofish.specifycloud.org/ as the sp7demofish user and logs into the KUFishvoucher collection, so you can see it in action, edit the code or data yourself, and see the result without worrying about changing a live production instance. (If you plan on developing your own application or adopting the one in the repository, you can use this sp7demofish instance for API testing purposes. The data in the instance should be regularly wiped.)

The code was developed as a minimum viable product (demo) without optimization in mind, so there is room to optimize it. To improve performance, you can also host a Specify 7 instance locally and have the application connect to that instance.
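Whatever client is used, a synonym can only point at its accepted name once that name exists in the tree, so an import of this kind needs some ordering. The sketch below shows one possible two-pass ordering and is not taken from the demo repository. The `create` callback is hypothetical and stands in for whatever call actually creates a taxon record (for example an HTTP request); here it is replaced by an in-memory stub so the example runs without a Specify instance.

```python
def upload_in_two_passes(taxa, synonyms, create):
    """Create accepted taxa first, then synonyms that reference them.

    taxa: dict mapping a name key to {"accepted": bool}
    synonyms: list of (synonym_key, accepted_key) pairs
    create: hypothetical callback (key, accepted_id) -> id of the new record
    """
    ids = {}
    # Pass 1: accepted names have no dependency on other names.
    for key, info in taxa.items():
        if info["accepted"]:
            ids[key] = create(key, None)
    # Pass 2: each synonym needs the id of its accepted name.
    for synonym_key, accepted_key in synonyms:
        if accepted_key not in ids:
            raise KeyError(f"accepted name {accepted_key} was never created")
        ids[synonym_key] = create(synonym_key, ids[accepted_key])
    return ids


# In-memory stand-in for a real upload call: records each request and
# returns an incrementing id.
log = []

def fake_create(key, accepted_id):
    log.append((key, accepted_id))
    return len(log)


taxa = {
    ("Microgale", "talazaci"): {"accepted": True},
    ("Oryzorictes", "talpoides"): {"accepted": False},
    ("Oryzorictes", "hova"): {"accepted": True},
}
synonyms = [(("Oryzorictes", "talpoides"), ("Oryzorictes", "hova"))]

ids = upload_in_two_passes(taxa, synonyms, fake_create)
```

In a real application the callback would wrap the request to the Specify 7 instance. The endpoint and field names for that request are best taken from the repository README rather than from this sketch.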


:warning: If interested, please read the README of the repository

https://github.com/melton-jason/Specify7-Api-Demo/tree/main

If this is not helpful, or an alternative approach should be considered, a demo using SQL directly to accomplish the same task can be made.

https://discourse.specifysoftware.org/t/establishing-relationship-between-synonym-and-preferred-accepted-taxon-en-masse/1663/8