BiologicalRecordsCentre / ABLE

Assessing ButterfLies in Europe project repository
Strict flight period for common butterfly species - rejected on the eBMS verification #724

Open CrisSevilleja opened 2 months ago

CrisSevilleja commented 2 months ago

Hello,

The Czech coordinator wanted to carry out validations in the verification system and realised that the rulesets for flagging butterflies are quite strict for some common species. He pointed out that Maniola jurtina and Coenonympha pamphilus are rejected in months when they usually fly. I checked the rulesets for flagging butterflies in issue #511 and found M. jurtina flying between June and mid-August. I took this from the document flightperiod_95perc.csv:

Species | BGR | Month | Period | nrec
-- | -- | -- | -- | --
Maniola jurtina | Continental | 7 | II | 35396
Maniola jurtina | Continental | 7 | I | 30685
Maniola jurtina | Continental | 6 | II | 25050
Maniola jurtina | Continental | 8 | I | 22809
Maniola jurtina | Continental | 1 | I | 11454
Maniola jurtina | Continental | 6 | I | 11251

The flight period of M. jurtina in Czechia runs from ca. 20 May to ca. 15 September (see https://portal.nature.cz/w/druh-31746#/), and for Coenonympha pamphilus it ranges from mid-April to mid-October (https://portal.nature.cz/w/druh-31751#/).

I think we can correct the flight periods of those two species in Czechia. Still, it would be best to check when other common species, such as Pyronia tithonus, Polyommatus thersites, Vanessa atalanta and Celastrina argiolus, are rejected, and to correct them for all countries. Another option is to check which species are rejected most often in the verification system and determine which of them have a longer flight period.

chrisvanswaay commented 2 months ago

@CrisSevilleja I can't find the script anymore (it will be somewhere), but I'm quite sure I used the 90% or 95% quantiles. For such abundant species, numbers in peak season are very high, so the tails can still hold substantial numbers (though small in percentage). In CZ, M. jurtina before 15 May will be doubtful; the question is whether you want to skip all M. jurtina in May (so also those before 15 May) or the other way around. I fear there is no easy fix.

chrisvanswaay commented 2 months ago

PS: everything was restricted to months, so you have to choose either to include May or not.

CrisSevilleja commented 2 months ago

@chrisvanswaay I wouldn't accept records of M. jurtina before 15 May in Czechia, but the tail at the end of the flight period is almost a month long: the rulesets stop at mid-August, while the species can be seen until mid-September.

I am posting this because other coordinators (in Austria, for example) told me the rulesets did not include common species, and I noticed this in Spain as well. I am wondering whether a new check can be done to improve the rulesets and include more of those species. We could involve more coordinators to check the flight periods of all the species in their country.

CrisSevilleja commented 2 months ago

Ah, I thought it was restricted to periods and not months.

chrisvanswaay commented 2 months ago

I'll try to find the script (there are so many, that I sometimes forget where I put them).

DavidRoy commented 2 months ago

I noticed this issue with common species too. Hopefully Chris can update the rules as we don't want to be manually adjusting them?

Also, it is worth noting that these automated checks only add flags to records; they do not lead to an accepted/rejected status, as that is set only by the human verifiers. We could use these rules to automate the verification, but it's good to be confident that they work in all (most) situations.
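To illustrate the distinction between flagging and verifying, here is a minimal Python sketch. The record layout, the function `flag_out_of_period` and the `ALLOWED` set are hypothetical and are not the eBMS implementation; the half-month periods are the Maniola jurtina (Continental) rows quoted from flightperiod_95perc.csv earlier in this thread.

```python
# Half-month periods listed for Maniola jurtina (Continental) in
# flightperiod_95perc.csv, as quoted earlier in this thread.
ALLOWED = {(7, "II"), (7, "I"), (6, "II"), (8, "I"), (1, "I"), (6, "I")}


def flag_out_of_period(record, allowed=ALLOWED):
    """Return a copy of the record with a 'flag' key; 'status' is left untouched."""
    flagged = dict(record)
    flagged["flag"] = (record["month"], record["period"]) not in allowed
    return flagged


# Hypothetical record from the second half of August (assumed field names).
rec = {"species": "Maniola jurtina", "month": 8, "period": "II", "status": "pending"}
result = flag_out_of_period(rec)
```

In this sketch the August II record gets a flag, but its status stays "pending" until a human verifier decides.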

chrisvanswaay commented 2 months ago

@DavidRoy Can you find out when I sent you the file with the flight periods, and what it was called? That would help me trace the script.

chrisvanswaay commented 2 months ago

With the script it would be easy to change the quantiles and run it again.

DavidRoy commented 2 months ago

It was captured in this issue: https://github.com/BiologicalRecordsCentre/ABLE/issues/511, which also links to an earlier issue. There is some discussion of the approach and of the file you supplied to us.

chrisvanswaay commented 2 months ago

Thanks, that helped, found the script. I used GBIF data for this, so if for some countries data is missing, then these species will be missing. Here is the table for M jurtina in Continental:

flight_period_M_jurtina_Continental.xlsx (https://github.com/user-attachments/files/17046047/flight_period_M_jurtina_Continental.xlsx)

The top rows:

species | code | month | period | nrec | cumsum | tot | cumperc
-- | -- | -- | -- | -- | -- | -- | --
Maniola jurtina | Continental | 7 | II | 35396 | 35396 | 153270 | 23,09388661
Maniola jurtina | Continental | 7 | I | 30685 | 66081 | 153270 | 43,11411235
Maniola jurtina | Continental | 6 | II | 25050 | 91131 | 153270 | 59,45781953
Maniola jurtina | Continental | 8 | I | 22809 | 113940 | 153270 | 74,33940106
Maniola jurtina | Continental | 1 | I | 11454 | 125394 | 153270 | 81,81248777
Maniola jurtina | Continental | 6 | I | 11251 | 136645 | 153270 | 89,15312847
Maniola jurtina | Continental | 8 | II | 11010 | 147655 | 153270 | 96,33653031
Maniola jurtina | Continental | 5 | II | 3114 | 150769 | 153270 | 98,36823906
Maniola jurtina | Continental | 9 | I | 1368 | 152137 | 153270 | 99,26078163
Maniola jurtina | Continental | 5 | I | 616 | 152753 | 153270 | 99,66268676
Maniola jurtina | Continental | 9 | II | 316 | 153069 | 153270 | 99,86885888

The columns will be clear. I used the cumperc (cumulative percentage) of the records (not numbers). I put the border at 95%, but we can change that to any other percentage.
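As a rough illustration of how the border works, the following Python sketch recomputes the cumulative percentage from the nrec values in the table above and keeps the half-month periods at or below a chosen threshold. It is not the original script: the function name `accepted_periods` is hypothetical, and the rule "keep rows with cumperc at or below the threshold" is an assumption inferred from the rows listed in flightperiod_95perc.csv.

```python
from itertools import accumulate

# (month, half) -> number of records, copied from the top rows of the table above.
NREC = {
    (7, "II"): 35396, (7, "I"): 30685, (6, "II"): 25050, (8, "I"): 22809,
    (1, "I"): 11454, (6, "I"): 11251, (8, "II"): 11010, (5, "II"): 3114,
    (9, "I"): 1368, (5, "I"): 616, (9, "II"): 316,
}
TOTAL = 153270  # the 'tot' column; it also covers rows not shown in the table


def accepted_periods(nrec, total, threshold):
    """Rank periods by record count and keep those with cumperc <= threshold."""
    ranked = sorted(nrec.items(), key=lambda item: item[1], reverse=True)
    cumsums = accumulate(count for _, count in ranked)
    return [
        period
        for (period, _), cumsum in zip(ranked, cumsums)
        if 100 * cumsum / total <= threshold
    ]


periods_95 = accepted_periods(NREC, TOTAL, 95)
periods_99 = accepted_periods(NREC, TOTAL, 99)
```

With a threshold of 95 this returns the six periods quoted at the start of the thread; raising it to 99 adds the second half of August and the second half of May.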

chrisvanswaay commented 2 months ago

And indeed (old man forgot) I did not do it by month, but by the first and second half of the month.
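A small Python sketch of binning record dates into half-month periods follows. The names `half_month` and `count_by_period` are hypothetical, and the split at day 15 (days 1 to 15 are period I, the rest period II) is an assumption; the thread does not state where the original script draws the line.

```python
from collections import Counter
from datetime import date


def half_month(d):
    """Map a date to (month, 'I' or 'II'); assumes days 1-15 form period I."""
    return (d.month, "I" if d.day <= 15 else "II")


def count_by_period(dates):
    """Count records per (month, half) bin."""
    return Counter(half_month(d) for d in dates)


# Made-up example dates, only to show the binning.
dates = [date(2024, 5, 14), date(2024, 5, 20), date(2024, 9, 15), date(2024, 9, 16)]
counts = count_by_period(dates)
```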

larspett commented 2 months ago

Is this what triggers warnings from iRecord too? iRecord flags Brimstone in Sweden in August as questionable, which it is not.

(Email reply to David Roy's comment of 18 Sep 2024, 16:50.)

chrisvanswaay commented 2 months ago

PS: I just noticed that month 1, period I is also in, probably because of records in GBIF with the date set to the first of January. I should have skipped those.

chrisvanswaay commented 2 months ago

Let me know if I have to run again (and skip all records on 1 Jan in GBIF) and what cumperc I should choose. Like I said this will always be a rough estimate, the only alternative is people filling in their own borders somewhere.
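Skipping the 1 January records before counting could look like the Python sketch below. The function name `drop_first_of_january` is hypothetical, and treating every record dated 1 January as a placeholder date is an assumption; a genuine record from that day would be dropped too.

```python
from datetime import date


def drop_first_of_january(dates):
    """Remove records dated 1 January, assumed here to be placeholder dates."""
    return [d for d in dates if not (d.month == 1 and d.day == 1)]


# Made-up example dates, only to show the filter.
kept = drop_first_of_january([date(2020, 1, 1), date(2020, 7, 10), date(2021, 1, 2)])
```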

larspett commented 2 months ago

Could you share the script (or email it)? It would be great to adapt our national validation to the same rules.

(Email reply to Chris van Swaay's comment of 18 Sep 2024, 17:07.)

chrisvanswaay commented 2 months ago

On the email...

CrisSevilleja commented 2 months ago

That new run of the script is more accurate, I think. I saw the month of January before; indeed, it has to be a mistake. Good that you included periods and not only months. Thanks, Chris.

zdfric commented 2 months ago

Dear all, I did not expect such a long discussion about this topic! What is wrong with Maniola jurtina from mid-May? It usually starts in Czechia in June, depending on altitude and area; however, this year especially we have plenty of records from May. Our country typically sits on the edge between the Continental and Atlantic climates, and this can sometimes lead to unpredictable, strange occurrence patterns. This year, for example, we had a lot of records of a second generation of Coenonympha arcania, Limenitis camilla, etc.

CrisSevilleja commented 1 month ago

Where are we with this issue @chrisvanswaay and @DavidRoy ? I have another coordinator (Switzerland) asking for the flight periods to be corrected before they start verifying the records (many red thumbs at the moment).

Perhaps one option would be to use the results produced by Chris to run the eBMS verification, re-run all pending records, and put the final list of periods per country/region somewhere on the eBMS. Alternatively, we could ask the coordinators to check this list, although this would be less efficient.