Closed DavidRoy closed 8 months ago
Hi @DavidRoy,
I have looked at the data, in theory this should not be difficult to import. There are a couple of questions below, but if we can answer these, I don't see any technical problems.
Essentially as we are importing onto existing samples, the only sample field we need to include in the import is the Sample ID (I assume you aren't attempting to update any of the details on the sample). Then I simply select that Sample ID column as the existing data column (and include the occurrence related columns on the row)
However, there are a couple of questions I need answering about this:
As noted in my email yesterday, do you want a species attribute to indicate no species, or do we simply treat a sample with no occurrences as a zero observation? If you want this as an attribute (like Plant Portal has), then let me know what you want to do about the website data entry screens, or whether this attribute will be used only on this import. It could be confusing, though, if it only applies to this import and isn't used elsewhere.
On the "Insect SPECIES data from pan-trap samples" page on SPRING, it looks to me as though the species rows expect a Specimen Code and an Abundance to be entered, and neither of these is in the import file. Having said that, they do not appear to be enforced as required, so if you can confirm you are happy for this information to be missing, we can continue with the import without it.
Andy
hi @andrewvanbreda on 1. can you raise a separate issue for this as it doesn't affect the occurrence upload
hi @DavidRoy @andrewvanbreda Yes, the abundance is 1 for each row, except where N/A is written (no SPRING pollinators found: Bees, Hoverflies or Butterflies). We do not need the specimen codes uploaded; they are just there to double-check that the specimen's name is written correctly.
thanks @iverkaik @andrewvanbreda for neatness, please add abundance = 1. Specimen code is null. Remove rows with N/A
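The file preparation David asks for here could be sketched roughly as below. This is only an illustration: the column names (`sample_id`, `species`, `specimen_code`, `abundance`) and the two example rows are made up, not taken from the real import file.

```python
import csv
import io

def prepare_rows(rows):
    """Drop N/A rows, set abundance to 1 and blank the specimen code."""
    prepared = []
    for row in rows:
        # Column names are assumptions, not the real file headers.
        if row["species"].strip().upper() == "N/A":
            continue  # N/A marks a sample with no SPRING pollinators found
        cleaned = dict(row)
        cleaned["abundance"] = "1"
        cleaned["specimen_code"] = ""  # left empty (null) for the import
        prepared.append(cleaned)
    return prepared

# Made-up example data, for illustration only.
sample_csv = (
    "sample_id,species,specimen_code\n"
    "101,Bombus terrestris,B-001\n"
    "102,N/A,\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
prepared = prepare_rows(rows)
print(prepared)
```

Working on copies of each row keeps the original data untouched, so the prepared file can be compared against the source if anything looks wrong after import.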
@andrewvanbreda can you also provide screenshots of the process
I will attach the prepared import file to this issue before import, and will screenshot the selections I intend to make with the importer so you can see.
@andrewvanbreda please consider that this is just a subset of the complete list for these surveys and sites (some specimens are still being identified by experts). We should have the completed list next week and also have confirmed that this system works.
thank you!
Hi @iverkaik @DavidRoy ,
Please find attached an example of the prepared file. I have also included the kind of selections I would expect to make with the importer.
The importer essentially has two main screens. The first screen is quite long so is split over two screenshots.
I have not actually done a proper test import of this yet because I have run into some further questions (there may be other small adjustments needed to the file such as changing M/F to Male/Female, but I will worry about those during testing)
"Record status" which includes the options "Data entry complete/unverified" "Verified" "Data entry still in progress"
I assume the Release Status should be Accepted, but let me know what you want the second option, Record Status, to be.
What we do in Indicia is actually save the ID of the grid as an occurrence attribute. This allows the system to know which grid to load the occurrence onto.
This means I need to know what type of species (Butterfly, Hoverfly, Bee) each row is. I am not sure how to go about this, without looking up each species now to see what it is. Unless they are all the same?
If this is a problem, I will need to test what the behaviour would be like without this attribute on a page set up the way this one is.
@iverkaik In terms of the data being a subset of further data.
Will the new list next week include the data from the file you have already provided? As I need to be careful I am not importing the same data twice.
Andy
@andrewvanbreda some response: Please set release status = Accepted Please set record status = Data entry complete/unverified
For the taxon group/grid, it looks like this test is all for bees but..
.. @iverkaik can you confirm this test dataset is all bees. For the full import, could you retain the column for 'taxon_list_id' when you create the upload files.
.. @andrewvanbreda this allows you to determine the taxon_group/grid for upload (272=hoverflies;273=bees; 251=butterflies)
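As a rough sketch of how the retained `taxon_list_id` column could be used, the snippet below groups rows by the three list IDs quoted above. The row structure and the `sample_id` column are assumptions for illustration; only the ID-to-group mapping comes from this thread.

```python
from collections import defaultdict

# IDs as given in the thread: 272 = hoverflies, 273 = bees, 251 = butterflies.
TAXON_LIST_GROUPS = {272: "hoverflies", 273: "bees", 251: "butterflies"}

def split_by_group(rows):
    """Group rows by taxon group, failing loudly on an unknown list ID."""
    groups = defaultdict(list)
    for row in rows:
        list_id = int(row["taxon_list_id"])
        if list_id not in TAXON_LIST_GROUPS:
            raise ValueError(f"Unexpected taxon_list_id: {list_id}")
        groups[TAXON_LIST_GROUPS[list_id]].append(row)
    return dict(groups)

# Made-up rows, for illustration only.
rows = [
    {"sample_id": "1", "taxon_list_id": "273"},
    {"sample_id": "1", "taxon_list_id": "272"},
    {"sample_id": "2", "taxon_list_id": "273"},
    {"sample_id": "3", "taxon_list_id": "251"},
]
by_group = split_by_group(rows)
for group, group_rows in by_group.items():
    print(group, len(group_rows))
```

Raising an error on an unrecognised ID, rather than skipping the row, means a typo in the column cannot silently drop records from the upload.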
Hi @andrewvanbreda @DavidRoy , No, the majority are bees in the list but there are few hoverflies and some butterflies. Should we need to create different data sets for each group or just create a column with their group (bees, hoverflies, butterflies)? I can do the latter easily as we have it this way in the database
Hi @iverkaik
If you create different files for each it is probably a bit clearer and better.
I won't be doing the import as one long whole file anyway (smaller imports, then can check for problems).
If you put butterflies, bees and hoverflies into separate files, I can also do things like select the appropriate species list on that first importer screen, and then the importer will validate better against that list during the import.
@DavidRoy @iverkaik I guess my only question is, should I wait to receive the whole dataset before doing the import? It depends on whether that final dataset is going to include the data from the subset file you have already provided.
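If the final dataset does turn out to include the subset, one way to avoid importing rows twice would be to subtract the already-imported rows before import. The sketch below is only an assumption about how that could be done: the key columns are hypothetical, and it counts occurrences rather than using a set, because several specimens of the same species in the same sample are legitimate separate rows.

```python
from collections import Counter

# Hypothetical columns that together identify a row; adjust to the real file.
KEY_COLUMNS = ("sample_id", "species", "sex")

def row_key(row):
    """Normalised tuple used to compare rows between the two files."""
    return tuple(row.get(col, "").strip().lower() for col in KEY_COLUMNS)

def rows_not_yet_imported(already_imported, full_dataset):
    """Return rows of full_dataset beyond those matched in already_imported."""
    remaining = Counter(row_key(r) for r in already_imported)
    new_rows = []
    for row in full_dataset:
        key = row_key(row)
        if remaining[key] > 0:
            remaining[key] -= 1  # matches one row that was already imported
        else:
            new_rows.append(row)
    return new_rows

# Made-up rows, for illustration only.
subset = [{"sample_id": "10", "species": "Pieris rapae", "sex": "F"}]
full = [
    {"sample_id": "10", "species": "Pieris rapae", "sex": "F"},
    {"sample_id": "10", "species": "Pieris rapae", "sex": "F"},
    {"sample_id": "11", "species": "Vanessa atalanta", "sex": "M"},
]
to_import = rows_not_yet_imported(subset, full)
print(len(to_import))
```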
Hi @andrewvanbreda Here is the file with the butterflies for the SP_MVS15 station, it is complete (9 specimens). Let me know how it works. SP_MVS15Butterflies.csv
@andrewvanbreda the current files are to test the import. If you are comfortable with the format and process it's ok if you want to wait for the full files. @iverkaik would this work for you? What is your schedule for finalising them? Would you rather do this in stages?
Hi @DavidRoy @andrewvanbreda we hope we can have them completed for next week. We can now try with the butterflies to confirm that the process is OK? just to make sure which format we need for all the files. I will also need the list for the Portuguese and French sites to do the same process. Spanish volunteers are also finishing to upload their results in the page. thank you
Hi @iverkaik Yes I am preparing to do a test import on the test database (warehouse). It is a little bit fiddly as the surveys are not setup the same on live, so my test isn't perfect. Then I will import that small file onto live and will examine the result. I will then message to let you know what happened.
I will let David comment on other work to be done.
Hi @DavidRoy @iverkaik I had to abandon my test on the test warehouse because I couldn't access the import screen.
So in the end I have done this on live. As it is only a few rows, this is low risk, and we can manually edit them if needed.
It looks like the import has worked to me. Are you able to check you are happy with the result in either the Warehouse User Interface or on the website?
I have included the settings and file I used below. The main difference from what I showed you before is that I realised I should select far fewer sample-related columns on the mappings page, as that data is already present on the sample. The one drawback is that this relies on the Sample ID being correct. However, I don't think selecting additional sample columns (such as date or pan trap number) helps validation: if these are wrong in the file, I think the row won't be rejected and the sample on the database would instead be overwritten with incorrect data. So I left it as the Sample ID column only.
@andrewvanbreda looks good to me @iverkaik to confirm
hi @andrewvanbreda @DavidRoy I just checked in the web and the imported data is there :) We will prepare the identification files for each group (butterflies, hoverflies, bees) And can you send me the pan trap files from Portugal? this database has been completed, so this one can be prepared and fully uploaded.
Thank you, iraima
Am assuming this is something David is going to provide? @DavidRoy Let me know if you need me to do anything to retrieve data.
yes, I already have it :) thank you @andrewvanbreda
Hi @andrewvanbreda I have the data for Portugal ready. Find the three separate files, one for bees (BEE), other for butterflies (BUT) and last for hoverflies (HOV).
PO_SPRING_BEE.csv PO_SPRING_BUT.csv PO_SPRING_HOV.csv
There are two species of Hoverflies that do not have a code as they do not appear in the taxon list, I think they should be included: Xanthandrus azorensis Sphaerophoria nigra
You will find them in the lines 142 (Xanthandrus azorensis) and 143, 144 & 150 (Sphaerophoria nigra).
Please let me know if this seems all OK or have any questions.
Thank you, iraima
Hi @iverkaik Thanks for preparing these, I will let you know thoughts once I have had a chance to look through these. I won't do any importing until I have given feedback.
allright @andrewvanbreda we will keep working on the rest of the data from France and Spain
thank you
Hi @iverkaik
I have prepared the files so they can be imported (attached). A couple of questions.
Shall I add an Abundance column containing 1 to the hoverflies file? It is missing there.
Can I check some of the characters are displaying correctly in the file.
Can you just confirm these are displaying correctly
"C. PÈrez-BaÒÛn - A. Aracil - A. Vuji?" "M·rio Boieiro et al." "Ante Vuji?"
Thanks
Andy
PO_SPRING_BEE_altered.csv PO_SPRING_BUT_altered.csv PO_SPRING_HOV_altered.csv
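One plausible explanation for the odd characters above, offered as a guess rather than a diagnosis, is that the file was saved as Latin-1 (or Windows-1252) and then opened as Mac Roman. Under that assumption the accented letters can be recovered by reversing the misread, as in this sketch. The "?" in "Vuji?" cannot be recovered this way, since that letter has no Latin-1 representation and was presumably already replaced when the file was saved.

```python
def undo_mac_roman_misread(text):
    """Reverse a Latin-1 file having been decoded as Mac Roman.

    Re-encodes the garbled text as Mac Roman to get the original bytes back,
    then decodes those bytes as Latin-1. The encoding pair is an assumption.
    """
    return text.encode("mac_roman").decode("latin-1")

print(undo_mac_roman_misread("C. PÈrez-BaÒÛn"))        # C. Pérez-Bañón
print(undo_mac_roman_misread("M·rio Boieiro et al."))  # Mário Boieiro et al.
```

If this is what happened, opening the original CSV with the correct encoding (and re-saving as UTF-8) would be cleaner than repairing the text afterwards.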
Hi @andrewvanbreda,
C. Pérez-Bañón - A. Aracil - A. Vujić Mário Boieiro et al. A. Vujić
In the Hoverflies file, lines 114, 117, 119, 120, 124 should display "Renata Santos - Ante Vujić". It seems the name of the second determiner appears on the "sex" column. I believe this happened because they were separated by commas. For the next files, we'll ensure that a hyphen is used.
Thank you, iraima
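The shifted columns described above are what an unquoted comma inside a field produces in a CSV file. As an alternative to switching to hyphens, a CSV writer that quotes such fields would also keep the columns aligned. The sketch below uses Python's standard `csv` module to flag rows whose field count differs from the header, and shows that quoting avoids the problem. The header names and the species in the example row are made up.

```python
import csv
import io

def find_misaligned_rows(csv_text):
    """Return 1-based line numbers of rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return [
        line_no
        for line_no, row in enumerate(reader, start=2)
        if len(row) != len(header)
    ]

# Unquoted comma between the two determiners: the row gains an extra field.
bad_text = (
    "species,determiner,sex\n"
    "Eristalis tenax,Renata Santos, Ante Vujić,F\n"
)
print(find_misaligned_rows(bad_text))

# csv.writer quotes any field containing the delimiter, so columns stay aligned.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["species", "determiner", "sex"])
writer.writerow(["Eristalis tenax", "Renata Santos, Ante Vujić", "F"])
good_text = buffer.getvalue()
print(find_misaligned_rows(good_text))
```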
@iverkaik I have attached fixed files. Perhaps you could have a look at them again as a second opinion before I attempt an import again. PO_SPRING_BEE_altered.csv PO_SPRING_BUT_altered.csv PO_SPRING_HOV_altered.csv
HI @andrewvanbreda Looks all good to me!
Hi @andrewvanbreda,
Find here the files for the specimens from France.
Similar to what occurred in Portugal, there are some specimens that should be added to the list: Syrphidae (as a family, for those that have not been identified to genus level) and Chrysotoxum lessonae
And for bees: Hylaeus purpurissatus Andrena afzeliella
Furthermore, in the bees and hoverflies files, there is a new column for adding comments during the identification process. This corresponds with the column that you can also find in the application. It will contain information about specimens that have been identified to subgenus or group level and are not in the list, as well as any reasons why they were not identified to the species level.
Please let me know if you need any further clarifications on these files. Thank you, iraima
Hi @iverkaik, OK thanks for the files. I won't be able to look at this until next week now because of work on urgent site upgrades, but I will let you know how I get on with both the imports.
All right @andrewvanbreda. Over the next few days I will also get the files for the samples from Spain ready. I just remembered that a volunteer from France tried to add specimens last year. Please consider that the only identifications from the pan traps are those in the files that I sent you. If there is information in some pan trap, this was only a test.
thank you, iraima
Hi @andrewvanbreda,
I hope the upload from France is proceeding smoothly. The list of identifications for Spain is now ready. I am just waiting for the final sample list to send it to you all, complete.
In addition to the bee species already mentioned, there are a couple more missing from the list. So, the "new" bees we need are:
Could you please provide me with the codes for these species?
Thank you, iraima
Hi @iverkaik I will add these and let you know when done. Am not sure what you mean by "codes"?, so let me know and I will try to send you the information.
I have not done any of the imports yet; I have not forgotten, and they are on my task list. I am helping with upgrades to several websites (including SPRING). If these are not done, it would affect the running of the websites, so I have to give that task higher priority. This is due by the end of the week, so I should have time for the import then, or before if I finish early.
Hi @andrewvanbreda
Thank you for your quick response and information. About the codes: these species do not appear on the bee list, hence they do not have an associated code. For example, species that do appear have one: Panurgus banksianus has id 544312 and Ceratina dentiventris has id 543046. I hope I have explained myself better.
and good luck with the upgrades! thank you, iraima
Hi @iverkaik Yes I understand what these numbers are now, these are what we call the taxa taxon list id in the database. That is fine I know what you mean by code now. However, note that the code for Ceratina dentiventris is 543056 (543046 is Ceratina albosticta), I hope this is just a small mistake in your message and not that you had the wrong information (see attached screenshot)
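A copying slip like this one could be caught before import by checking each name and ID pair in the file against a reference list. The sketch below is hypothetical: it uses only the three name and ID pairs mentioned in this thread, and the column names `species` and `taxa_taxon_list_id` are assumptions about the file layout. A real check would use the full taxon list.

```python
# Name and ID pairs as stated in this thread; a real check needs the full list.
KNOWN_IDS = {
    "Panurgus banksianus": 544312,
    "Ceratina dentiventris": 543056,
    "Ceratina albosticta": 543046,
}

def find_id_mismatches(rows, known_ids=KNOWN_IDS):
    """Return (species, id_in_file, expected_id) for rows that disagree."""
    mismatches = []
    for row in rows:
        expected = known_ids.get(row["species"])
        found = int(row["taxa_taxon_list_id"])
        if expected is not None and found != expected:
            mismatches.append((row["species"], found, expected))
    return mismatches

# Made-up rows reproducing the slip discussed above.
rows = [
    {"species": "Panurgus banksianus", "taxa_taxon_list_id": "544312"},
    {"species": "Ceratina dentiventris", "taxa_taxon_list_id": "543046"},
]
mismatches = find_id_mismatches(rows)
print(mismatches)
```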
you are right @andrewvanbreda, was an error copying the number, sorry
so yes, this is what I mean, if you can provide me with the ones that are missing, great!
thank you, iraima
@iverkaik I'll send you an updated list of samples and an updated taxon list
Hi @andrewvanbreda,
Find here the files for the specimens from Spain. SP_SPRING_BEE.csv SP_SPRING_BUT.csv SP_SPRING_HOV.csv
I could not add the id for the following bees, as they are not listed:
Please let me know if you have any question! Thank you
Hello again @andrewvanbreda,
Could you please confirm whether you've started uploading the samples from France? Additional samples have been provided by volunteers, and I can send you an updated version. However, if you have already begun, I can upload the new information through the application instead.
Thank you, iraima
Hi @iverkaik I have not done any of the uploads or started yet, so please provide an updated version. Thanks Andy
All right, here are the updated files from France (I deleted the earlier ones to avoid any confusion)
FR_SPRING_HOV.csv FR_SPRING_BUT.csv FR_SPRING_BEE.csv
In these files the missing taxa are: Syrphidae (as a family, for those that have not been identified to genus level) and Chrysotoxum lessonae
And for bees: Hylaeus purpurissatus
Thank you!
Hi @iverkaik
I added the following taxa Hoverflies Xanthandrus azorensis (ID 624149) Sphaerophoria nigra (ID 624150) Syrphidae (ID 624151) Chrysotoxum lessonae (ID 624152)
Bees Hylaeus purpurissatus (ID 624153) Andrena afzeliella (ID 624154) Amegilla talaris (ID 624155) Hylaeus gibbus (ID 624156)
Do not worry about editing anything you already uploaded, I can make the changes before I do imports (I won't be able to do the import today). This will avoid too many files being attached to this Github thread.
I will post the files I actually use for the import here, so you will be able to see what has been imported
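Filling in the newly created IDs before import could look roughly like the sketch below. The IDs are the ones listed above; the column names and the example rows are assumptions, and this is not necessarily how the files were actually edited.

```python
# IDs of the newly added taxa, as listed in the comment above.
NEW_TAXA_IDS = {
    "Xanthandrus azorensis": 624149,
    "Sphaerophoria nigra": 624150,
    "Syrphidae": 624151,
    "Chrysotoxum lessonae": 624152,
    "Hylaeus purpurissatus": 624153,
    "Andrena afzeliella": 624154,
    "Amegilla talaris": 624155,
    "Hylaeus gibbus": 624156,
}

def fill_missing_ids(rows):
    """Fill empty ID cells from NEW_TAXA_IDS; report names still unresolved."""
    filled, unresolved = [], []
    for row in rows:
        updated = dict(row)
        # Column names are assumptions about the file layout.
        if not updated.get("taxa_taxon_list_id", "").strip():
            new_id = NEW_TAXA_IDS.get(updated["species"].strip())
            if new_id is None:
                unresolved.append(updated["species"])
            else:
                updated["taxa_taxon_list_id"] = str(new_id)
        filled.append(updated)
    return filled, unresolved

# Made-up rows, for illustration only.
rows = [
    {"species": "Sphaerophoria nigra", "taxa_taxon_list_id": ""},
    {"species": "Ceratina dentiventris", "taxa_taxon_list_id": "543056"},
    {"species": "Some unlisted taxon", "taxa_taxon_list_id": ""},
]
filled, unresolved = fill_missing_ids(rows)
print(filled)
print(unresolved)
```

Returning the unresolved names separately makes it easy to spot any taxon that still needs adding to the list before the import is attempted.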
excellent @andrewvanbreda thank you, please let me know if you have any question, iraima
Hi @iverkaik Have done the original Portugal import now. These files 1 PO_SPRING_BUT_altered.csv 2 PO_SPRING_BEE_altered.csv 3 PO_SPRING_HOV_altered.csv
Excellent @andrewvanbreda, and thank you for the files
Hi @iverkaik Spain data has been imported as per the attached files
1. SP_SPRING_BUT_ready.csv 2A. SP_SPRING_BEE_ready.csv 2B. SP_SPRING_BEE_ready.csv 3. SP_SPRING_HOV_ready.csv
great, great news @andrewvanbreda thank you!
@iverkaik @DavidRoy France data imported. So I think that is everything complete 1. FR_SPRING_BUT_ready.csv 2. FR_SPRING_HOV_ready.csv 3. FR_SPRING_BEE_ready.csv
superb @andrewvanbreda thank you!
@DavidRoy @iverkaik Please close once you are happy with the import.
Hi @DavidRoy I have just found an unexpected consequence of the import. I have now received an email from a verifier about it (see attached). Although I am not sure why it is saying the occurrence is edited.
I also noticed another oddity. Due to records going through the verification system, but the EU PoMS (SPRING) project not actually being displayed as a survey in iRecord, the links in the email actually go nowhere (following the link says the record cannot be found)
I guess just let me know if you want me to do anything with the notification emails (e.g. forward them)
Also let me know if you want the second point looked at (e.g. raised in Github)
Data sent by email. Requirement is to upload Pan Trap occurrence data for samples already entered.