BiologicalRecordsCentre / ABLE

Assessing ButterfLies in Europe project repository

Swedish moth names #542

Open larspett opened 1 year ago

larspett commented 1 year ago

Last spring we provided a list of Swedish day active moths (https://github.com/BiologicalRecordsCentre/ABLE/issues/417) that I believe was deployed but none of those seem available.

Also, I wonder if we did provide a full list of the Swedish macro moths? If not, we can easily do that. Currently we only see moth names in German in the moth part of the app. In the butterfly parts of the app, only butterflies are available despite the day active moth option being enabled.

If you want us to provide a moth dictionary, what format and which taxonomic groups should I provide?

larspett commented 1 year ago

I have a list with corrected names where the Swedish scientific names don’t match the ebms ones 1:1. I can upload that later today if those are ok with @andrewvanbreda. Maybe @andrewvanbreda wants to check first?

andrewvanbreda commented 1 year ago

Hi @larspett,

If you upload it to this thread, I can look into getting it uploaded to the Warehouse (depending on David/Jim's answers below). I always review content as best I can before I do any upload so I would check first, this would purely be for spotting obvious typing mistakes. I also need to make sure the format is ok for import.

@JimBacon @DavidRoy We do have a problem here (and something I was going to bring up with the Dutch upload that needs to be done). Currently I can't use the importer without running into the problem we had last time, where adding new common names will set existing ones to deleted. If it is done like this, we would need to use a modified version of the correction script I used after the first import. I spoke to John, and the importer should be able to handle this; however, when I have tried it, it throws errors in the situation where there is an existing common name (when tested on my own machine). I am not sure of your preference. I think the options are likely:

  1. I use the importer, but use the similar fix I used last time to correct the database.
  2. Jim to import the data directly into the database (I only have read-only warehouse access).
  3. Wait for me to correct the importer and then do the import. Even in this scenario, it is a bit tricky to do an import, as the existing common names need to be pulled into the import spreadsheet so they don't get overwritten. I spoke to John yesterday about this, and I will add a new column to the importer for adding a common name (that doesn't overwrite the field). However, again, this needs me to implement it, and I am heavily loaded at the moment (in particular, trying to make quite a few Indicia sites 8.1 compatible, which I think is fairly urgent).

DavidRoy commented 1 year ago

I favour option 2. As you say, the update to PHP 8.1 is more urgent. @JimBacon?

JimBacon commented 1 year ago

Okay. I have an established method for doing this.

larspett commented 1 year ago

moth_list_swe_Errors_LP.xlsx Here are the two lists as separate tabs with comments included. Hope it makes sense @andrewvanbreda

DavidRoy commented 1 year ago

My suggestion is that we add a set of additional synonyms but keep our current list as the preferred names (aligned with the EU Moth Red List).

@larspett could you produce a simplified file with two columns: 1. current name; 2. synonym to add.

@kazlauskis when creating species lists in the app, do you include both synonyms and preferred names? If not, I think we should.

larspett commented 1 year ago

@DavidRoy I think @andrewvanbreda wanted to have it like this to double-check. I followed his instructions & suggestions, just put it in an Excel sheet. I can of course merge columns, but the entries need some checks at both ends, I think?

DavidRoy commented 1 year ago

ok, I'll leave this to you and andrew. These issues that get to 57 comments are very hard to follow.....

andrewvanbreda commented 1 year ago

Hi @DavidRoy @larspett The original idea was to change the import spreadsheet to match the database so the import would work. However a possibility would be to add the names as new synonyms and not change the original import sheet. Either way I think we needed Lars's feedback to make sure the alternate names I suggested are ok matches.

I think what is best is if I check what Lars has written, and then I will advise Jim on the choices he has for approach.

andrewvanbreda commented 1 year ago

Hi all,

OK, I have tried my best to create an overview of the remaining elements of this issue so that everyone doesn't have to keep trawling through the long thread (although the situation is still a bit complicated).

1. @larspett Firstly, there is another file that I didn't do any imports for that is referenced above (I have re-attached it here for convenience).

Scientific_and_Swedish_common_names_additions_and_mismatches_v2.txt

Everything in this file was already present in database apart from the last line

"Scientific mismatch Yponomeuta cagnagellus/malinellus benveds-/apelspinnmal cagnagellus"

There is no match for "Yponomeuta cagnagellus/malinellus" in the database, however "Yponomeuta cagnagella/malinellus" does exist. Lars: are you able to confirm it is OK to add the "benveds-/apelspinnmal cagnagellus" translation to that?

2. @JimBacon @DavidRoy @larspett I have looked through Lars's corrections to my import, and produced an attached file I think contains everything.

Simplified corrections.csv

There are six columns as follows

  1. "Import file latin (no DB match)" The original Latin name provided in the import file, for which there is no exact match in the EBMS Moths species list.

  2. "Accepted alternative" An alternative that I suggested that Lars has agreed to.

  3. "Lars’s Request" A name that I think Lars is suggesting should be used, however it doesn't appear to me to be present in EBMS Moths.

  4. "Swedish translation" The Swedish to import.

  5. "Lars comment" Any comments made by Lars

  6. "Andrew comment" Any comments made by myself

So essentially, it is column 2, the "Accepted alternative" that you need to import the Swedish onto. However, this is still blank in a lot of cases, so an import won't work on those rows. For these blank ones, you see that "Lars’s Request" column is filled. This is the name that Lars is suggesting, but it looks to me to be missing from EBMS Moths. So in those cases you might need to work out between yourselves whether new items need adding to the EBMS Moths species list, or perhaps there are spelling issues causing there to be no match (an alternative could also be to find a synonym)
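As a rough illustration of the split described above (not the actual import procedure), a short Python sketch could separate the rows that have an "Accepted alternative" from those that only have a value in "Lars’s Request". The column headings are taken from the list above; the function name, the sample rows and the placeholder species names are assumptions made up for the example.

```python
import csv
import io

# Column headings as described in this thread; the real file may differ.
ACCEPTED = "Accepted alternative"
REQUESTED = "Lars’s Request"


def split_rows(csv_text):
    """Return (importable, unresolved) lists of row dicts.

    A row counts as importable when its "Accepted alternative" cell
    is non-blank; every other row is left for manual review.
    """
    importable, unresolved = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row.get(ACCEPTED) or "").strip():
            importable.append(row)
        else:
            unresolved.append(row)
    return importable, unresolved


# Hypothetical sample data: placeholder names, not entries from the real file.
SAMPLE = (
    "Import file latin (no DB match),Accepted alternative,"
    "Lars’s Request,Swedish translation\n"
    "Genus alpha,Genus alfa,,svenskt namn A\n"
    "Genus beta,,Genus betae,svenskt namn B\n"
)

importable, unresolved = split_rows(SAMPLE)
for row in unresolved:
    print("Needs review:", row[REQUESTED])
```

The unresolved rows are the ones where a decision is still needed on adding a new entry, fixing a spelling, or finding a synonym.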

@JimBacon I have noticed when working on other projects that the Indicia importer needs a particular text encoding to import non-UK characters correctly (e.g. ä) without error. I have found that "UTF-8 with BOM" has been reliable in the past (obviously this might not apply to you depending on how you go about your import).
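For anyone preparing such a file with a script rather than a spreadsheet program, here is a minimal Python sketch of writing a CSV as UTF-8 with a byte order mark. Python's built-in `utf-8-sig` codec adds the BOM on write; the file name, column heading and helper function are assumptions for the example, and whether a given import route needs the BOM is not something this sketch can confirm.

```python
import csv
import os
import tempfile


def write_csv_with_bom(path, header, rows):
    """Write rows to a CSV file encoded as UTF-8 with a leading BOM."""
    # "utf-8-sig" prepends the bytes EF BB BF when writing.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical output file; the names are Swedish examples quoted in this thread.
path = os.path.join(tempfile.mkdtemp(), "swedish-names-example.csv")
write_csv_with_bom(path, ["swedish_name"], [["Blåbärshakvecklare"], ["Rönnvecklare"]])

# Check that the file really starts with the BOM.
with open(path, "rb") as handle:
    first_bytes = handle.read(3)

# Reading back with the same codec strips the BOM again.
with open(path, encoding="utf-8-sig", newline="") as handle:
    read_back = list(csv.reader(handle))
```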

andrewvanbreda commented 1 year ago

@JimBacon Forgot to mention there is also the option that David mentioned above of importing the name from the original import file as a synonym of the existing name in the database. So in that case I think you would be importing the name in the first column "Import file latin (no DB match)" as a synonym of the name in the second column "Accepted alternative". I guess it just depends if you/David want those names in the species list.

larspett commented 1 year ago

@andrewvanbreda this looks fine, the only thing I find is that there is a trailing "cagnagellus" added to the line "benveds-/apelspinnmal cagnagellus".

The line should be "benveds-/apelspinnmal" and the synonym etc should become: "Yponomeuta cagnagella/malinellus" <> "Yponomeuta cagnagellus/malinellus" <> "benveds-/apelspinnmal"

andrewvanbreda commented 1 year ago

Hi @larspett I have added the Swedish, but I have not added the synonym for that yet. David/Jim will decide whether they want the extra synonym names present, the other rows may need new entries also.

Vilius-Stankaitis commented 1 year ago

Hi, I have added this to the app. Let me know if any updates are required.

larspett commented 1 year ago

Are the translations forward only or updating present reports too? When I check on the app, previous reports are still scientific names

Vilius-Stankaitis commented 1 year ago

We wanted to update species data for other countries, so Swedish common moth names (e.g. Blåbärshakvecklare, Rönnvecklare, etc.) were included as well.

https://butterflies.app.flumens.io/demo.html

larspett commented 1 year ago

so in demo but not in production version of app yet?

Vilius-Stankaitis commented 1 year ago

Production as well v1.20.0

larspett commented 1 year ago

OK, I notice a couple of things. My top species have the names as common names as well as scientific names, but the counts are incorrect (Orthosia gothica should be 3 but is 1). The submitted moth trap surveys are still shown with scientific names although common names are selected. I guess this is because they were submitted as scientific names before the translation was available?

Vilius-Stankaitis commented 1 year ago

Mhm, I will investigate this. Cheers

larspett commented 1 year ago

@Vilius-Stankaitis I try to switch between common names in Swedish and scientific names but only get Swedish ones all the time (unless I enter the scientific name manually, then it sticks). I try changing the common/scientific option in settings but nothing happens. Shouldn't it?

Vilius-Stankaitis commented 1 year ago

I sent you an email last week with more details (could you check the spam folder?). I will create a new ticket.

larspett commented 1 year ago

@Vilius-Stankaitis Hi, I haven't received such an email from you directly (not in the spam folder either). Can you please resend with cc to lars.b.pettersson@gmail.com? It could have been stopped before reaching the mailbox.

Vilius-Stankaitis commented 1 year ago

Hi all,

a few mistakes:

"Gammafly" is not assigned to the Swedish common name.

"Gammaeule" currently is assigned to the Swedish common names list (should be assigned to German).

Could someone fix this?

Thank you

JimBacon commented 1 year ago

@Vilius-Stankaitis I've fixed the names for Autographa gamma mentioned in your comment above, thanks.

I'm going to import another 221 Swedish moth names this afternoon that have been matched to scientific names already in our list. There are a further 88 Swedish names still to come once they have a matching scientific name in the moth list.

JimBacon commented 1 year ago

I've uploaded the following 232 Swedish moth names; a few more than expected as I found some synonyms among the names I thought were unmatched. I haven't added the synonyms at the moment.

swedish-moth-names.csv

Furthermore, I

This leaves 74 unmatched names still to import.

larspett commented 1 year ago

JimBacon commented 1 year ago

Hi @larspett. In your spreadsheet, moth_list_swe_Errors_LP.xlsx (in https://github.com/BiologicalRecordsCentre/ABLE/issues/542#issuecomment-1527720391), you wrote that P. brevilineatellus is a synonym of P. salicicolella. I was about to conclude there was a copy and paste error in the spreadsheet, but then the Catalogue of Life also states they are synonyms. I don't know what to do!

larspett commented 1 year ago

Ah, taxonomy, the joy of it :) Georg, who compiled the list, found the synonym listing whereas I found a different (older) listing where they were not considered synonyms. All these four are apparently considered synonyms and the common name should be Vinkelbandad videguldmal. Sorry about that.

larspett commented 1 year ago

@JimBacon I noticed that there are no translations for species like Korscheltellus lupulinus (lupulina), are we missing common names for Hepialidae? The image recognition fails too. ObsIdentify solves the image recognition 100%

JimBacon commented 1 year ago

The translation for Korscheltellus lupulina was one I added on Monday so I would not expect it to be present in the app yet.

Can you attach the image which is not giving the result you expect, so that I can investigate? My guess is that the photo is identified as Korscheltellus lupulinus, which would not be matched against our list as we have Korscheltellus lupulina without the synonym, at present.

larspett commented 1 year ago

Sounds likely, here it is (not cropped here) image

JimBacon commented 1 year ago

It's more depressing than I thought. The image classifier returns another synonym altogether, Pharmacis lupulina.

We need to do something to align the eBMS and the NIA taxonomy. David's comment suggests this is planned.
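To illustrate the kind of alignment meant here, a minimal Python sketch of resolving a classifier result to the preferred name through a synonym table follows. The three names are the ones mentioned in this thread; the table layout and the `resolve` function are assumptions for the example and do not reflect how eBMS or the NIA actually store or match names.

```python
# Preferred names as held in the species list (one example from this thread).
PREFERRED = {"Korscheltellus lupulina"}

# Synonym -> preferred name. Hypothetical structure for illustration only.
SYNONYMS = {
    "Korscheltellus lupulinus": "Korscheltellus lupulina",
    "Pharmacis lupulina": "Korscheltellus lupulina",
}


def resolve(name):
    """Return the preferred name for a classifier result, or None if unknown."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    if cleaned in PREFERRED:
        return cleaned
    return SYNONYMS.get(cleaned)


for result in ["Pharmacis lupulina", "Korscheltellus lupulinus", "Some unlisted name"]:
    print(result, "->", resolve(result))
```

Without the two synonym entries, both classifier results fall through to `None`, which matches the behaviour seen in the app.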

larspett commented 1 year ago

What is the NIA taxonomy? Couldn't easily find my way through all NIA acronyms

JimBacon commented 1 year ago

Nature Identification API

DavidRoy commented 1 year ago

We are switching to the updated NIA soon. We’ll also review taxonomy then.

DavidRoy commented 1 year ago

NIA = ObsIdentify

kazlauskis commented 8 months ago

The app part is done for this and includes the Swedish moth names. Can we close it now?

DavidRoy commented 8 months ago

@larspett

JimBacon commented 8 months ago

I can see a comment of mine from 22 May: 'This leaves 74 unmatched names still to import'. Don't close until I have been able to check what happened to those.

There is a separate issue, https://github.com/Indicia-Team/drupal-8-modules-indicia-ai/issues/2, regarding updating the NIA classifier. This is also still to do.

larspett commented 8 months ago

Last time I had a look there were still numerous names in German. I can have a look now. Which ones are the unmatched names @JimBacon ?

larspett commented 8 months ago

This, for instance, is an attempt to view butterfly data, but what appears as top entries are bumblebees and hoverflies. image

larspett commented 8 months ago

Some more examples: image

larspett commented 8 months ago

Looks like synonyms that are missing