BiologicalRecordsCentre / ABLE

Assessing ButterfLies in Europe project repository

wrong names? #574

Open chrisvanswaay opened 1 year ago

chrisvanswaay commented 1 year ago

@JimBacon @kazlauskis While checking the online data for the 15-min counts I noticed this strange record (see screenshot below). It is in NL, but Glaucopsyche alexis does not occur in NL. The common name is given as klöverblåvinge, which I think is Swedish (I am not sure for which species, but it is not English or Dutch), and the Name as entered is icarusblauwtje. The latter is very plausible, but the scientific name of that species is Polyommatus icarus (Common Blue in English). I am not sure what went wrong where, but this should be checked.

[screenshot]

chrisvanswaay commented 1 year ago

PS I checked klöverblåvinge online, and it does indeed refer to G. alexis, which is not the species that was entered (it should be Polyommatus icarus). I also don't understand why the Swedish name is given as the Common Name.

larspett commented 1 year ago

This is a bit worrying, since it is similar to an error we experienced (reported as an issue and now fixed) with Autographa gamma, which received a common name in German despite the correct Swedish common name having been provided in the general update of Swedish common names.

kazlauskis commented 1 year ago

Thanks, we're looking into this now.

kazlauskis commented 1 year ago

This user has recorded icarusblauwtje (taxon ID 540250, occurrence link). The taxon is pulled into the app from the warehouse as follows:

{
    "id": "540250",
    "taxon_group": "insect - moth",
    "taxon": "icarusblauwtje",
    "language_iso": "nld",
    "preferred": "f",
    "parent_id": "447515",
    "external_key": "EBMSM0000000009171",
    "preferred_taxa_taxon_list_id": "455403",
    "preferred_taxon": "Glaucopsyche alexis",
    "attr_id_taxa_taxon_list_194": null,
    "attr_taxa_taxon_list_194": null
}

@JimBacon Can you or someone else on your end have a look at why a) this common name doesn't match the correct scientific name, and b) this is flagged as a moth instead of a butterfly?

JimBacon commented 1 year ago

What I find is that EBMS Moths list contains

Meanwhile, the EBMS Butterflies list also contains Polyommatus icarus and Glaucopsyche alexis. In the Butterfly list, the Dutch name for Glaucopsyche alexis is bloemenblauwtje.

Why are the butterflies in the moth list?

On further investigation, there are 536 butterfly species in the moth list (under the superfamily Papilionoidea). Of these, 393 are in the 'insect - butterfly' taxon group and are flagged as not for data entry, while 143 are in the 'insect - moth' taxon group and are allowed for data entry.
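
A breakdown like this could be reproduced with a query along the following lines. This is only a sketch, not the query actually used for the figures above. It reuses the tables, the list ID (260) and the Papilionoidea path ID (208572) that appear in the queries later in this thread. The `ttl.preferred` filter is an assumption, added so that each species is counted once rather than once per common name.

```sql
-- Sketch: count butterfly taxa in the EBMS moth list,
-- split by taxon group and by the data entry flag.
select t.taxon_group_id,
    ttl.allow_data_entry,
    count(*) as taxa
from cache_taxon_paths tp
join taxa_taxon_lists ttl
    on tp.taxon_meaning_id = ttl.taxon_meaning_id
join taxa t
    on t.id = ttl.taxon_id
where tp.taxon_list_id = 260 -- EBMS Moths
    and 208572 = any(tp.path) -- Papilionoidea
    and ttl.preferred = true -- assumption: count preferred (Latin) names only
group by t.taxon_group_id, ttl.allow_data_entry
order by t.taxon_group_id, ttl.allow_data_entry;
```

Dropping the `ttl.preferred` condition would instead count every name, common names included.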

All of these 143 were updated on 18th/19th April of this year by @andrewvanbreda, coinciding with the creation of Swedish common names for moths. You can find them in file 4 of https://github.com/BiologicalRecordsCentre/ABLE/issues/542#issuecomment-1513690768.

I conclude that the presence of butterflies in the list of Swedish moths went unnoticed and they were imported as moths. Since the butterfly names were already present in the list, no warning was raised on import. The consequence was that the taxon group was changed and they became available for data entry, along with 91 pre-existing common names in Dutch.

I will change all 143 offending taxa to 'insect - butterfly' and 'allow_data_entry = false' for both preferred (i.e. Latin) names and associated common names. This should prevent them appearing in the app and on data entry forms.

Occurrences made against these taxa should be reassigned to the corresponding taxa in the butterfly list once the update to the app has been deployed and users have updated.

JimBacon commented 1 year ago

The taxon group of the problematic records was updated by this query.

update taxa
set taxon_group_id = 104, updated_on = now(), updated_by_id = 2
where id in (
    select ttl.taxon_id
    from cache_taxon_paths tp
    join taxa_taxon_lists ttl
        on tp.taxon_meaning_id = ttl.taxon_meaning_id 
    where tp.taxon_list_id = 260 -- EBMS Moths
        and 208572 = any(tp.path) -- Papilionoidea
)
and taxon_group_id = 114;

The allow_data_entry flag of the problematic records was set by this query

update taxa_taxon_lists
set allow_data_entry = false, updated_on = now(), updated_by_id = 2
where id in (
    select ttl.id
    from cache_taxon_paths tp
    join taxa_taxon_lists ttl
        on tp.taxon_meaning_id = ttl.taxon_meaning_id 
    where tp.taxon_list_id = 260 -- EBMS Moths
        and 208572 = any(tp.path) -- Papilionoidea
)
and allow_data_entry = true;

To ensure consistency, I prompted a cache update with this query.

insert into work_queue(task, entity, record_id, params, cost_estimate, priority, created_on)
select 'task_cache_builder_update', 'taxa_taxon_list', ttl.id, null, 100, 2, now()
from cache_taxon_paths tp, taxa_taxon_lists ttl
where tp.taxon_meaning_id = ttl.taxon_meaning_id 
    and tp.taxon_list_id = 260 -- EBMS Moths
    and 208572 = any(tp.path) -- Papilionoidea
order by id;

Ready for you to pull, @kazlauskis. Pass back to me when complete so I can try to clean up the occurrence records.

JimBacon commented 1 year ago

@kazlauskis you may have already done this but let me know.

chrisvanswaay commented 1 year ago

See also https://github.com/BiologicalRecordsCentre/ABLE/issues/520 @kazlauskis

JimBacon commented 1 year ago

Version 1.21.0 (261), updated on 8th June and in the app store today, still exhibits the problem. That is to say, I can still find icarusblauwtje in the list of moths, which is the source of the problem.

CrisSevilleja commented 1 year ago

This issue persists, and records of Glaucopsyche alexis keep accumulating in the Netherlands, where the species is not present. At the moment I can find 319 records of G. alexis that should be Polyommatus icarus. I can see that these errors come from icarusblauwtje being wrongly linked to Glaucopsyche alexis in 15-min counts.

What is preventing this from being fixed? Those wrong records should be changed to Polyommatus icarus. Thanks.

kazlauskis commented 1 year ago

We have fixed this in the app v1.22.0, but it was in Beta testing in August. All those records are coming from users with older app versions.

@CrisSevilleja, you probably have the latest app version, no? Can you confirm it is working as expected there?

JimBacon commented 1 year ago

I've just updated to the latest version of the app, v1.22.0 (298) of 11th September. Icarusblauwtje no longer appears in the moth list for me, which is a good sign.

I was waiting to know the app was fixed before trying to correct the records.

kazlauskis commented 1 year ago

I would give it another month for more people to update their apps before correcting the species.

JimBacon commented 2 weeks ago

Prior to correcting the faulty occurrence records I examined the data and noted the following:

The update query was as follows:

drop table if exists ebms_occurrences_to_update; 

with butterflies_in_moth_list as (
      -- Find the butterfly taxa in the EBMS moth list.
      select cttl.id as moth_taxa_taxon_list_id, 
            cttl.taxon as moth_taxon, 
            cttl.preferred_taxon as moth_preferred_taxon
      from cache_taxa_taxon_lists cttl 
      join cache_taxon_paths ctp 
            on ctp.taxon_meaning_id = cttl.taxon_meaning_id 
      where cttl.taxon_list_id  = 260 -- EBMS Moths
            and 208572 = any(ctp.path) -- Papilionoidea
),
butterflies_in_butterfly_list as (
      -- Find the matching taxa in the EBMS butterfly list
      select biml.*, 
            cttl.id as bfly_taxa_taxon_list_id, 
            cttl.taxon as bfly_taxon, 
            cttl.preferred_taxon as bfly_preferred_taxon
      from butterflies_in_moth_list biml 
      left join cache_taxa_taxon_lists cttl
            on biml.moth_taxon = cttl.taxon
      where cttl.taxon_list_id = 251 -- EBMS Butterflies
),
ebms_occurrences_of_butterflies_in_moth_list as (
      -- Find the EBMS occurrences that need correcting
      select cof.id, cof.created_on, bibl.*
      from cache_occurrences_functional cof 
      join butterflies_in_butterfly_list bibl
            on bibl.moth_taxa_taxon_list_id = cof.taxa_taxon_list_id
      where cof.website_id = 118 -- EBMS
)
select *
into temporary table ebms_occurrences_to_update
from ebms_occurrences_of_butterflies_in_moth_list;

update occurrences o
set taxa_taxon_list_id = eotu.bfly_taxa_taxon_list_id,
      updated_by_id = 2,
      updated_on = now()
from ebms_occurrences_to_update eotu
where o.id = eotu.id;

insert into work_queue(task, entity, record_id, params, cost_estimate, priority, created_on)
select 'task_cache_builder_update', 'occurrence', id, null, 100, 2, now()
from ebms_occurrences_to_update order by id;

JimBacon commented 2 weeks ago

@chrisvanswaay Please could you check that the problem has been fixed and close this issue if okay.
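
On the warehouse side, a check along these lines might help confirm the result. It is a sketch only, assuming the same cache tables, list ID (260), website ID (118) and Papilionoidea path ID (208572) as the update query above, and it is only meaningful once the queued cache updates have been processed.

```sql
-- Sketch: count EBMS occurrences still attached to
-- butterfly taxa in the EBMS moth list.
select count(*) as remaining
from cache_occurrences_functional cof
join cache_taxa_taxon_lists cttl
    on cttl.id = cof.taxa_taxon_list_id
join cache_taxon_paths ctp
    on ctp.taxon_meaning_id = cttl.taxon_meaning_id
where cof.website_id = 118 -- EBMS
    and cttl.taxon_list_id = 260 -- EBMS Moths
    and ctp.taxon_list_id = 260
    and 208572 = any(ctp.path); -- Papilionoidea
```

A non-zero count would not necessarily mean the update failed. The update query only reassigns occurrences whose name has an exact match in the butterfly list, because the filter on `cttl.taxon_list_id = 251` discards unmatched rows from the left join, so any names without a match would still be counted here.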

CrisSevilleja commented 2 weeks ago

> We have fixed this in the app v1.22.0, but it was in Beta testing in August. All those records are coming from users with older app versions.
>
> @CrisSevilleja, you probably have the latest app version, no? Can you confirm it is working as expected there?

There are no more G. alexis records for the Netherlands, while P. icarus has been recorded recently. Thank you for fixing this.

chrisvanswaay commented 2 weeks ago

Thanks @JimBacon. I will ask my colleague to check tomorrow; up to now I see no changes. I will inform you of the outcome either way.