sfg-taxonpages / orthoptera


Move citations that contain (c) to Attribution #84

Open mjy opened 2 months ago

mjy commented 2 months ago

Nearly all images contain a Citation with a Source that has (c) in the title. These data need to be represented in Attribution to properly align with the broader semantics and with any copyright-based filtering that may emerge.

typophyllum commented 2 months ago

Already a couple of months ago I was surprised to find SFS data sources converted to sources in TW, many of them with a considerable number of "citations" (greatly inflating statistics). Would it be possible to make a script that can convert sources containing © into attributions (or perhaps one that can create attributions based on these sources and another to then delete the latter)?
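A minimal sketch of the kind of script asked for here, written against plain Python dicts rather than the TW data model. The key names `source_title` and `citation_object_id`, the shape of the attribution record, and the idea that the title is just the marker plus the holder's name are all assumptions made for illustration; only `citation_object_type` is a field name mentioned in this thread.

```python
import re

# Matches the ASCII "(c)" / "(C)" marker or the © sign.
COPYRIGHT_MARK = re.compile(r"\(c\)|©", re.IGNORECASE)


def has_copyright_mark(title):
    """True if the (possibly missing) source title carries a copyright marker."""
    return bool(COPYRIGHT_MARK.search(title or ""))


def split_citations(citations):
    """Partition citations into (to_convert, to_keep) by their source title."""
    to_convert, to_keep = [], []
    for citation in citations:
        if has_copyright_mark(citation.get("source_title")):
            to_convert.append(citation)
        else:
            to_keep.append(citation)
    return to_convert, to_keep


def to_attribution(citation):
    """Build a hypothetical attribution record from one flagged citation.

    Assumption: whatever remains of the title after removing the marker
    is the copyright holder. Real titles may need further parsing.
    """
    holder = COPYRIGHT_MARK.sub("", citation["source_title"]).strip(" ,.;:")
    return {
        "object_type": citation["citation_object_type"],
        "object_id": citation["citation_object_id"],  # hypothetical key
        "copyright_holder": holder,
    }
```

Keeping the detection (`split_citations`) separate from the conversion (`to_attribution`) mirrors the two-step option above: attributions can be created and checked first, and the flagged sources deleted only afterwards.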

MMCigliano commented 2 months ago

@LocoDelAssembly, can you check this issue please?

klausriede commented 2 months ago

I suppose some table has a field that differentiates between references, media items (sounds, pictures), etc. Just count them separately for the statistics. In addition, it might be interesting to differentiate between references containing revisions, original descriptions, those connected to a type, etc.

mjy commented 2 months ago

> Just count them separately for the statistics.

Yes (citation_object_type is the field), you could track this internally, but the point is that we won't build code to do this when another, more accurate representation of the data is in place. We don't want to have to tease out project-specific uses of the data ("we used these fields to record this because we can") by writing more code; we want to write code that reflects how data are represented so that all projects can benefit ("we put our attribution data in the Attribution bin, therefore when TW adds Attribution summarizing code we can take advantage of it").
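For reference, the per-type tally conceded at the start of this comment would look roughly like the sketch below. It assumes citations exported as plain dicts that carry a `citation_object_type` key; the example type values are hypothetical, and this is not TW code.

```python
from collections import Counter


def count_by_object_type(citations):
    """Tally citations per citation_object_type value."""
    return Counter(c["citation_object_type"] for c in citations)


# Hypothetical export: two image citations and one on another object type.
example = [
    {"citation_object_type": "Image"},
    {"citation_object_type": "Image"},
    {"citation_object_type": "Otu"},
]
counts = count_by_object_type(example)
```

A tally like this only separates the numbers; it does not tell code which citations are really copyright statements, which is the case for moving them to Attribution instead.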

typophyllum commented 2 months ago

To avoid complications, it would possibly be sufficient to convert the names of the photographers into attributions (creator, or rather copyright holder) and drop the institution names (which are identical to the repositories of the photographed specimens).

mjy commented 2 months ago

I'm sure this can be batch-fixed, @typophyllum; the data are clean.