BiologicalRecordsCentre / SPRING

Repository for tracking issues for the SPRING (EU Pollinator Monitoring) project
GNU General Public License v3.0

Upload of sample data for Spain #117

Closed DavidRoy closed 5 months ago

DavidRoy commented 9 months ago

Data sent by email. Requirement is to upload Pan Trap occurrence data for samples already entered.

andrewvanbreda commented 9 months ago

Hi @DavidRoy,

I have looked at the data and, in theory, this should not be difficult to import. There are a couple of questions below, but if we can answer these, I don't see any technical problems.

Essentially as we are importing onto existing samples, the only sample field we need to include in the import is the Sample ID (I assume you aren't attempting to update any of the details on the sample). Then I simply select that Sample ID column as the existing data column (and include the occurrence related columns on the row)

However, there are a couple of questions I need answering about this:

  1. As noted in my email yesterday, do you want a species attribute to indicate no species, or do we simply treat a sample with no occurrences as a zero observation? If you want this as an attribute (like Plant Portal has), then let me know what you want to do about the website data entry screens, or whether this attribute will be used only on this import. It could be confusing, though, if it only applies to this import and isn't used elsewhere.

  2. On the "Insect SPECIES data from pan-trap samples" page on SPRING, the species rows appear to expect a Specimen Code and an Abundance to be entered, and neither of these is in the import file. That said, neither field appears to be enforced as required, so if you can confirm you are happy to continue with this information missing, we can go ahead with the import without them.

Andy

DavidRoy commented 9 months ago

hi @andrewvanbreda on 1. can you raise a separate issue for this as it doesn't affect the occurrence upload

  2. @iverkaik can you confirm that the abundance should = 1 (i.e. single specimens)? And confirm that you do not have or want specimen codes uploaded.

iverkaik commented 9 months ago

hi @DavidRoy @andrewvanbreda Yes, the abundance is = 1 for each row, except for the cases where N/A is written (no SPRING pollinators found: bees, hoverflies or butterflies). We do not need the specimen codes uploaded; they are only there to double check that each specimen's name is written correctly.

DavidRoy commented 9 months ago

thanks @iverkaik @andrewvanbreda for neatness, please add abundance = 1. Specimen code is null. Remove rows with N/A
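A minimal sketch of the clean-up requested above (abundance = 1, empty specimen code, N/A rows removed), assuming the file is read with Python's `csv` module. The column names `sample_id`, `species`, `abundance` and `specimen_code` and the sample rows are hypothetical and would need to match the real import file.

```python
import csv
import io

def prepare_rows(rows, species_col="species", abundance_col="abundance",
                 specimen_col="specimen_code"):
    """Drop N/A rows, set abundance to 1 and leave the specimen code empty."""
    prepared = []
    for row in rows:
        # Rows marked N/A mean no SPRING pollinators were found, so skip them.
        if row.get(species_col, "").strip().upper() == "N/A":
            continue
        cleaned = dict(row)
        cleaned[abundance_col] = "1"   # one specimen per row
        cleaned[specimen_col] = ""     # specimen code is not uploaded
        prepared.append(cleaned)
    return prepared

# Hypothetical two-row file used only to illustrate the behaviour.
sample_csv = "sample_id,species\n101,Bombus terrestris\n102,N/A\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))
prepared = prepare_rows(rows)
```

Keeping the original rows untouched and returning new dictionaries makes it easy to compare the prepared file with the one that was sent.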

DavidRoy commented 9 months ago

@andrewvanbreda can you also provide screenshots of the process

andrewvanbreda commented 9 months ago

  1. #118

  2. I will attach the prepared import file to this issue before import, and will screenshot the selections I intend to make with the importer so you can see.

iverkaik commented 9 months ago

@andrewvanbreda please note that this is just a subset of the complete list for these surveys and sites (some specimens are still being identified by experts). We should have the completed list next week, and by then should also have confirmed that this system works.

thank you!

andrewvanbreda commented 9 months ago

Hi @iverkaik @DavidRoy ,

Please find attached an example of the prepared file. I have also included the kind of selections I would expect to make with the importer.

The importer essentially has two main screens. The first screen is quite long so is split over two screenshots.

I have not actually done a proper test import of this yet because I have run into some further questions (there may be other small adjustments needed to the file such as changing M/F to Male/Female, but I will worry about those during testing)
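As an illustration of the M/F adjustment mentioned above, a small sketch that expands the abbreviations and leaves anything else untouched. The function name is hypothetical, and which labels the importer actually accepts would need to be checked during testing.

```python
# Assumed target labels, taken from the "Male/Female" wording in the comment.
SEX_LABELS = {"M": "Male", "F": "Female"}

def expand_sex(value):
    """Return the full label for M or F; leave any other value unchanged."""
    return SEX_LABELS.get(value.strip().upper(), value)
```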

  1. On the first screen you will see two drop-downs:

     A. "Record Status" (I think this is labelled wrong and should be called Release Status), which includes the options "Unconfirmed - not reviewed", "Accepted" and "Data entry still in progress".

     B. "Record status", which includes the options "Data entry complete/unverified", "Verified" and "Data entry still in progress".

     I assume the Release Status should be Accepted, but let me know what you want the second Record Status option to be.

  2. I had a thought about this. On the website page "Insect SPECIES data from pan-trap samples" there are 3 grids.

     What we do in Indicia is save the ID of the grid as an occurrence attribute. This allows the system to know which grid to load the occurrence onto.

     This means I need to know what type of species (Butterfly, Hoverfly, Bee) each row is. I am not sure how to go about this without looking up each species now to see what it is. Unless they are all the same?

     If this is a problem, I will need to test what the behaviour would be without this attribute on the type of setup this page has.

  3. @iverkaik In terms of the data being a subset of further data:

     Will the new list next week include the data from the file you have already provided? I need to be careful that I am not importing the same data twice.
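One way to guard against importing the same data twice is to compare the later file against the subset before import. This is only a sketch: the key columns `sample_id` and `taxon_id` are hypothetical names, and because each row is a single specimen, a matching key is a candidate duplicate to review rather than proof of one.

```python
def find_overlap(subset_rows, full_rows, key_cols=("sample_id", "taxon_id")):
    """Return rows of full_rows whose key already occurs in subset_rows."""
    seen = {tuple(row[col] for col in key_cols) for row in subset_rows}
    return [row for row in full_rows
            if tuple(row[col] for col in key_cols) in seen]
```

Rows returned by `find_overlap` could then be checked by hand before deciding whether to drop them from the second import.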

Andy

Import screen 1A

Import screen 1B

Import screen 2

SP_subsetPROOF EDITED (UTF-8).csv

DavidRoy commented 9 months ago

@andrewvanbreda some responses:

Please set release status = Accepted

Please set record status = Data entry complete/unverified

For the taxon group/grid, it looks like this test is all bees, but @iverkaik can you confirm this test dataset is all bees? For the full import, could you retain the 'taxon_list_id' column when you create the upload files?

@andrewvanbreda this allows you to determine the taxon_group/grid for upload (272 = hoverflies; 273 = bees; 251 = butterflies)
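A short sketch of how the `taxon_list_id` values quoted above could be used to sort rows into the three groups before upload. The function name and the row layout (a list of dictionaries, as produced by `csv.DictReader`) are assumptions; only the three ID-to-group pairs come from this thread.

```python
from collections import defaultdict

# ID-to-group mapping as quoted in the comment above.
GROUP_BY_TAXON_LIST_ID = {"272": "hoverflies", "273": "bees", "251": "butterflies"}

def split_by_group(rows, list_col="taxon_list_id"):
    """Group rows by taxon group; unrecognised IDs go under 'unknown'."""
    groups = defaultdict(list)
    for row in rows:
        key = str(row.get(list_col, "")).strip()
        groups[GROUP_BY_TAXON_LIST_ID.get(key, "unknown")].append(row)
    return dict(groups)
```

Anything that lands under `unknown` would need checking by hand before it is assigned to a grid.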

iverkaik commented 8 months ago

Hi @andrewvanbreda @DavidRoy, No, the majority in the list are bees, but there are a few hoverflies and some butterflies. Do we need to create different data sets for each group, or just add a column with their group (bees, hoverflies, butterflies)? I can do the latter easily as we have it this way in the database.

andrewvanbreda commented 8 months ago

Hi @iverkaik
If you create different files for each it is probably a bit clearer and better. I won't be doing the import as one long file anyway (smaller imports are easier to check for problems). If you put butterflies, bees and hoverflies into separate files, I can also select the appropriate species list on that first importer screen, and the importer will then validate better against that list during the import.

@DavidRoy @iverkaik I guess my only question is, should I wait to receive the whole dataset before doing the import? It depends on whether that final dataset is going to include the data from the subset file you have already provided.

iverkaik commented 8 months ago

Hi @andrewvanbreda Here is the file with the butterflies for the SP_MVS15 station, it is complete (9 specimens). Let me know how it works. SP_MVS15Butterflies.csv

DavidRoy commented 8 months ago

@andrewvanbreda the current files are to test the import. If you are comfortable with the format and process it's ok if you want to wait for the full files. @iverkaik would this work for you? What is your schedule for finalising them? Would you rather do this in stages?

iverkaik commented 8 months ago

Hi @DavidRoy @andrewvanbreda we hope to have them completed next week. Can we try now with the butterflies to confirm that the process is OK, just to make sure which format we need for all the files? I will also need the list for the Portuguese and French sites to do the same process. Spanish volunteers are also finishing uploading their results to the page. Thank you

andrewvanbreda commented 8 months ago

Hi @iverkaik Yes, I am preparing to do a test import on the test database (warehouse). It is a little bit fiddly as the surveys are not set up the same as on live, so my test isn't perfect. Then I will import that small file onto live and examine the result. I will then message to let you know what happened.

I will let David comment on other work to be done.

andrewvanbreda commented 8 months ago

Hi @DavidRoy @iverkaik I had to abandon my test on the test warehouse because I couldn't access the import screen.

So in the end I have done this on live. As it is only a few rows, this is low risk, and we can manually edit them if needed.

It looks like the import has worked to me. Are you able to check you are happy with the result in either the Warehouse User Interface or on the website?

I have included the settings and file I used below. The main difference from what I showed you before is that I realised I should select far fewer sample-related columns on the mappings page, as that data is already present on the sample. The only drawback is that this relies on the Sample ID being correct. However, I don't think selecting more sample columns (such as date or pan trap number) would help validation: if these are wrong in the file, I think the row won't be rejected and the sample on the database would instead be overwritten with the incorrect data. So I left it as the Sample ID column only.
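Because the import relies on the Sample ID being correct, a pre-import check along these lines may help. It is a sketch only: `known_ids` stands for a list of existing sample IDs exported from the warehouse by whatever means is available, and the column name `sample_id` is hypothetical.

```python
def unknown_sample_ids(rows, known_ids, id_col="sample_id"):
    """Return the sample IDs in the file that are not in the known list."""
    known = {str(sample_id).strip() for sample_id in known_ids}
    in_file = {str(row[id_col]).strip() for row in rows}
    return sorted(in_file - known)
```

An empty result only shows that every ID exists. It does not show that each row is attached to the sample it was meant for, so a spot check against date or pan trap number is still worthwhile.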

Screen 1

Screen 2

Screen 3

SP_MVS15Butterflies.csv

DavidRoy commented 8 months ago

@andrewvanbreda looks good to me @iverkaik to confirm

iverkaik commented 8 months ago

hi @andrewvanbreda @DavidRoy I just checked on the website and the imported data is there :) We will prepare the identification files for each group (butterflies, hoverflies, bees). Also, can you send me the pan trap files from Portugal? That database has been completed, so it can be prepared and fully uploaded.

Thank you, iraima

andrewvanbreda commented 8 months ago

Am assuming this is something David is going to provide? @DavidRoy Let me know if you need me to do anything to retrieve data.

iverkaik commented 8 months ago

yes, I already have it :) thank you @andrewvanbreda

iverkaik commented 8 months ago

Hi @andrewvanbreda I have the data for Portugal ready. Find the three separate files, one for bees (BEE), other for butterflies (BUT) and last for hoverflies (HOV).

PO_SPRING_BEE.csv PO_SPRING_BUT.csv PO_SPRING_HOV.csv

There are two species of hoverflies that do not have a code as they do not appear in the taxon list. I think they should be included: Xanthandrus azorensis and Sphaerophoria nigra.

You will find them in the lines 142 (Xanthandrus azorensis) and 143, 144 & 150 (Sphaerophoria nigra).

Please let me know if this seems all OK or have any questions.

Thank you, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik Thanks for preparing these, I will let you know thoughts once I have had a chance to look through these. I won't do any importing until I have given feedback.

iverkaik commented 8 months ago

All right @andrewvanbreda, we will keep working on the rest of the data from France and Spain.

thank you

andrewvanbreda commented 8 months ago

Hi @iverkaik

I have prepared the files so they can be imported (attached). A couple of questions.

  1. Shall I add an Abundance column containing 1 to the hoverflies file? It is missing there.

  2. Can I check that some of the characters are displaying correctly in the file? Can you confirm whether these are displaying correctly:

"C. PÈrez-BaÒÛn - A. Aracil - A. Vuji?" "M·rio Boieiro et al." "Ante Vuji?"

Thanks

Andy

PO_SPRING_BEE_altered.csv PO_SPRING_BUT_altered.csv PO_SPRING_HOV_altered.csv

iverkaik commented 8 months ago

Hi @andrewvanbreda,

  1. Yes, "abundance" should be set to 1. Sorry for the missing column in the file.
  2. Some names contain accents and Spanish characters. I'm unsure if the system accepts them. The names are:

C. Pérez-Bañón - A. Aracil - A. Vujić

Mário Boieiro et al.

A. Vujić
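One possible explanation for the garbled names quoted earlier, offered as a hypothesis rather than a confirmed diagnosis: the file was saved as Windows-1252 (which has no "ć", so it became "?") and then opened as Mac Roman. The sketch below reproduces that chain with Python's built-in codecs; the function name is made up for illustration.

```python
def misread(text):
    """Encode as Windows-1252, then decode the same bytes as Mac Roman."""
    # errors="replace" substitutes "?" for characters cp1252 cannot store.
    return text.encode("cp1252", errors="replace").decode("mac_roman")

garbled_first = misread("C. Pérez-Bañón - A. Aracil - A. Vujić")
garbled_second = misread("Mário Boieiro et al.")
```

Under this assumption the results match the garbled strings in the earlier comment, which suggests that saving and reading the files consistently as UTF-8 would preserve all of these names, including "ć".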

In the hoverflies file, lines 114, 117, 119, 120 and 124 should display "Renata Santos - Ante Vujić". It seems the name of the second determiner appears in the "sex" column. I believe this happened because the names were separated by commas. For the next files, we'll ensure that a hyphen is used.
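The column shift described above is what happens when a comma inside a value is not quoted. As a sketch, Python's `csv` module quotes such values automatically when writing, so the value stays in one column when read back. The `sex` value "F" and the two-column layout are made up for the example.

```python
import csv
import io

def write_rows(rows):
    """Write rows as CSV text; fields containing commas are quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

text = write_rows([["determiner", "sex"],
                   ["Renata Santos, Ante Vujić", "F"]])
# Reading the text back keeps both determiners in the first column.
parsed = list(csv.reader(io.StringIO(text)))
```

Using a hyphen, as proposed, avoids the problem as well; quoting is simply an alternative when the file is generated by a script.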

Thank you, iraima

andrewvanbreda commented 8 months ago

@iverkaik I have attached fixed files. Perhaps you could have a look at them again as a second opinion before I attempt an import again. PO_SPRING_BEE_altered.csv PO_SPRING_BUT_altered.csv PO_SPRING_HOV_altered.csv

iverkaik commented 8 months ago

HI @andrewvanbreda Looks all good to me!

iverkaik commented 8 months ago

Hi @andrewvanbreda,

Find here the files for the specimens from France.

Similar to what occurred in Portugal, there are some specimens that should be added to the list. For hoverflies: Syrphidae (as a family, for those that have not been identified to genus level) and Chrysotoxum lessonae.

And for bees: Hylaeus purpurissatus and Andrena afzeliella.

Furthermore, in the bees and hoverflies files, there is a new column for adding comments during the identification process. This corresponds with the column that you can also find in the application. It will contain information about specimens that have been identified to subgenus or group level and are not in the list, as well as any reasons why they were not identified to the species level.

Please let me know if you need any further clarifications on these files. Thank you, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik, OK thanks for the files. I won't be able to look at this until next week now because of work on urgent site upgrades, but I will let you know how I get on with both the imports.

iverkaik commented 8 months ago

All right @andrewvanbreda. During the next few days I will also get the files for the samples from Spain ready. I just remembered that a volunteer from France tried to add specimens last year. Please note that the only valid identifications from the pan traps are those in the files that I sent you. If any pan trap already has information, it was only entered as a test.

thank you, iraima

iverkaik commented 8 months ago

Hi @andrewvanbreda,

I hope the upload from France is proceeding smoothly. The list of identifications for Spain is now ready. I am just waiting for the final sample list to send it to you all, complete.

In addition to the bee species already mentioned, there are a couple more missing from the list. So, the "new" bees we need are:

  1. Amegilla talaris
  2. Andrena afzeliella
  3. Hylaeus purpurissatus
  4. Hylaeus gibbus

Could you please provide me with the codes for these species?

Thank you, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik I will add these and let you know when done. Am not sure what you mean by "codes"?, so let me know and I will try to send you the information.

I have not done any of the imports yet. I have not forgotten, and they are on my task list. I am helping with upgrades relating to several websites (including SPRING); if not done, these would affect the running of the websites, so I have to give that task higher priority. This is due by the end of the week, so I should have time for the import then, or before if I finish early.

iverkaik commented 8 months ago

Hi @andrewvanbreda

Thank you for your quick response and information. About the codes: these species do not appear on the bee list, hence they do not have an associated code. For example, species that do appear have one: Panurgus banksianus has id 544312 and Ceratina dentiventris has id 543046. I hope I have explained myself better.

and good luck with the upgrades! thank you, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik Yes I understand what these numbers are now, these are what we call the taxa taxon list id in the database. That is fine I know what you mean by code now. However, note that the code for Ceratina dentiventris is 543056 (543046 is Ceratina albosticta), I hope this is just a small mistake in your message and not that you had the wrong information (see attached screenshot)
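A mix-up like this can be caught before import by checking each name and ID pair in a file against the reference list. The sketch below uses only the three pairs mentioned in this thread as its reference; the column names `species` and `taxa_taxon_list_id` are assumptions about the file layout, and a real check would load the full taxon list.

```python
# Name-to-ID pairs taken from the comments above (partial list only).
REFERENCE_IDS = {
    "Panurgus banksianus": 544312,
    "Ceratina dentiventris": 543056,
    "Ceratina albosticta": 543046,
}

def mismatched_ids(rows, name_col="species", id_col="taxa_taxon_list_id"):
    """Return (name, id_in_file, expected_id) for rows that disagree."""
    problems = []
    for row in rows:
        expected = REFERENCE_IDS.get(row[name_col])
        if expected is not None and int(row[id_col]) != expected:
            problems.append((row[name_col], int(row[id_col]), expected))
    return problems
```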

Screen Shot 2023-11-09 at 09 04 27

iverkaik commented 8 months ago

you are right @andrewvanbreda, was an error copying the number, sorry

so yes, this is what I mean, if you can provide me with the ones that are missing, great!

thank you, iraima

DavidRoy commented 8 months ago

@iverkaik I'll send you an updated list of samples and an updated taxon list

iverkaik commented 8 months ago

Hi @andrewvanbreda,

Find here the files for the specimens from Spain. SP_SPRING_BEE.csv SP_SPRING_BUT.csv SP_SPRING_HOV.csv

I could not add the id for the following bees, as they are not listed:

  1. Amegilla talaris
  2. Andrena afzeliella
  3. Hylaeus purpurissatus

Please let me know if you have any question! Thank you

iverkaik commented 8 months ago

Hello again @andrewvanbreda,

Could you please confirm whether you've started uploading the samples from France? Additional samples have been provided by volunteers, and I can send you an updated version. However, if you have already begun, I can proceed with uploading the new information through the application.

Thank you, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik I have not done any of the uploads or started yet, so please provide an updated version. Thanks Andy

iverkaik commented 8 months ago

All right, here are the updated files from France (I deleted the earlier ones to avoid any confusion):

FR_SPRING_HOV.csv FR_SPRING_BUT.csv FR_SPRING_BEE.csv

In these files the missing taxa are: Syrphidae (as a family, for those that have not been identified to genus level) and Chrysotoxum lessonae.

And for bees: Hylaeus purpurissatus

Thank you!

andrewvanbreda commented 8 months ago

Hi @iverkaik

I added the following taxa.

Hoverflies:

  1. Xanthandrus azorensis (ID 624149)
  2. Sphaerophoria nigra (ID 624150)
  3. Syrphidae (ID 624151)
  4. Chrysotoxum lessonae (ID 624152)

Bees:

  1. Hylaeus purpurissatus (ID 624153)
  2. Andrena afzeliella (ID 624154)
  3. Amegilla talaris (ID 624155)
  4. Hylaeus gibbus (ID 624156)
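With the new IDs listed above, the blank ID cells in the submitted files could be filled in by a script rather than by hand. This is a sketch under assumptions: the column names `species` and `taxa_taxon_list_id` are hypothetical, and rows that already carry an ID are left alone.

```python
# IDs for the newly added taxa, as listed in the comment above.
NEW_TAXON_IDS = {
    "Xanthandrus azorensis": 624149,
    "Sphaerophoria nigra": 624150,
    "Syrphidae": 624151,
    "Chrysotoxum lessonae": 624152,
    "Hylaeus purpurissatus": 624153,
    "Andrena afzeliella": 624154,
    "Amegilla talaris": 624155,
    "Hylaeus gibbus": 624156,
}

def fill_missing_ids(rows, name_col="species", id_col="taxa_taxon_list_id"):
    """Fill blank IDs for the newly added taxa; leave other rows as they are."""
    filled = []
    for row in rows:
        updated = dict(row)
        if not str(updated.get(id_col, "")).strip():
            new_id = NEW_TAXON_IDS.get(updated[name_col])
            if new_id is not None:
                updated[id_col] = str(new_id)
        filled.append(updated)
    return filled
```

Rows whose ID is still blank afterwards point to a name that is neither in the original list nor among the new taxa, often a spelling difference.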

Do not worry about editing anything you already uploaded, I can make the changes before I do imports (I won't be able to do the import today). This will avoid too many files being attached to this Github thread.

I will post the files I actually use for the import here, so you will be able to see what has been imported.

iverkaik commented 8 months ago

excellent @andrewvanbreda thank you, please let me know if you have any question, iraima

andrewvanbreda commented 8 months ago

Hi @iverkaik Have done the original Portugal import now. These files 1 PO_SPRING_BUT_altered.csv 2 PO_SPRING_BEE_altered.csv 3 PO_SPRING_HOV_altered.csv

iverkaik commented 8 months ago

Excellent @andrewvanbreda, and thank you for the files

andrewvanbreda commented 7 months ago

Hi @iverkaik Spain data has been imported as per the attached files

1. SP_SPRING_BUT_ready.csv 2A. SP_SPRING_BEE_ready.csv 2B. SP_SPRING_BEE_ready.csv 3. SP_SPRING_HOV_ready.csv

iverkaik commented 7 months ago

great, great news @andrewvanbreda thank you!

andrewvanbreda commented 7 months ago

@iverkaik @DavidRoy France data imported. So I think that is everything complete 1. FR_SPRING_BUT_ready.csv 2. FR_SPRING_HOV_ready.csv 3. FR_SPRING_BEE_ready.csv

iverkaik commented 7 months ago

superb @andrewvanbreda thank you!

andrewvanbreda commented 7 months ago

@DavidRoy @iverkaik Please close once you are happy with the import.

andrewvanbreda commented 7 months ago

Hi @DavidRoy I have just found an unexpected consequence of the import. I have now received an email from a verifier about it (see attached). Although I am not sure why it is saying the occurrence is edited.

I also noticed another oddity. Due to records going through the verification system, but the EU PoMS (SPRING) project not actually being displayed as a survey in iRecord, the links in the email actually go nowhere (following the link says the record cannot be found)

I guess just let me know if you want me to do anything with the notification emails (e.g. forward them)

Also let me know if you want the second point looked at (e.g. raised in Github)

1 - Email Top 2 - Email Bottom