emo-bon / governance-data

Holds the governance content for the emo-bon data management

Swap size_frac_low and up #21

Closed kmexter closed 1 month ago

kmexter commented 4 months ago

Raising this issue here as it is to be done for all the observatory googlesheets. See https://github.com/emo-bon/observatory-esc68n-crate/issues/5 for background

The order of business is

  1. swap these column titles in all water and sediment googlesheets - @melinalou @melanthia to do and @kmexter to check and confirm
  2. change the description in the "definitions" tab for all logsheets - @melinalou @melanthia to do and @kmexter to check and confirm
  3. change the source_mat_id final column in all sampling tabs of all water googlesheets: in the equation there, col R should be changed to col Q (change the equation for the first row and drag-drop down to replace values in all rows) (note that this is not part of the id for sediment, so nothing needs to change there) - @melinalou @melanthia to do and @kmexter to check and confirm
  4. @laurianvm to change the definition for these terms in the ontology and the ttl template (the definitions should come from the ENA checklists: for water, https://www.ebi.ac.uk/ena/browser/view/ERC000024; for sediment, https://www.ebi.ac.uk/ena/browser/view/ERC000021)
  5. @kmexter to change the logsheet_schema_extended to change the definition there
  6. @kmexter to check if the QC also needs to change (was a check done on the low being > the high? if so, needs to change)

Only after point 6 is this issue solved.

Can we start with points 1,2,3 please? When done, comment in this issue so the next person knows they have to start their part. Note: Katrina will be away until Aug 19.
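A minimal sketch of the kind of ordering check point 6 refers to, assuming that after the swap size_frac_low is expected to be the smaller of the two values. The function name, the row layout and the treatment of missing values are hypothetical and are not taken from the actual QC code.

```python
from typing import Optional


def size_frac_order_ok(size_frac_low: Optional[float],
                       size_frac_up: Optional[float]) -> bool:
    """Return True when the pair is consistent with low < up.

    Assumption: a missing value means there is nothing to compare,
    so the row is not flagged.
    """
    if size_frac_low is None or size_frac_up is None:
        return True
    return size_frac_low < size_frac_up


# Hypothetical rows; the numeric values are illustrative only.
rows = [
    {"size_frac_low": 3.0, "size_frac_up": 200.0},  # expected order after the swap
    {"size_frac_low": 200.0, "size_frac_up": 3.0},  # still in the pre-swap order
    {"size_frac_low": 0.2, "size_frac_up": None},   # nothing to compare
]
flags = [size_frac_order_ok(r["size_frac_low"], r["size_frac_up"]) for r in rows]
print(flags)
```

If the existing QC tested the opposite direction (low greater than up), this comparison is the part that would need to flip.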

kmexter commented 3 months ago

@melinalou @melanthia (or possibly @cpavloud) can you tell me if you have done this swap in the logsheets - i.e. swapped the column titles around and changed the definition in the definition tab (which requires creating a new definitions tab as a copy of the old one, otherwise you cannot edit it). I do not want to change this in the governance data until it is changed in the logsheets.

melinalou commented 2 months ago

Hi! The swap of size_frac_low and size_frac_up columns has been done in all logsheets and the definitions are:

size_frac_low: size-fraction lower threshold. Refers to the mesh/pore size used to pre-filter/pre-sort the sample. Materials larger than the size threshold are excluded from the sample.

size_frac_up: size-fraction upper threshold. Refers to the mesh/pore size used to retain the sample. Materials smaller than the size threshold are excluded from the sample.

Unfortunately I have permission to create a new copy of the definitions sheet but not to rename or delete the old one. But do they need to change?

kmexter commented 2 months ago

OK, so the columns have been swapped in all Sampling tabs, but yes, you do need to swap the definitions also. I would do that in a new definitions sheet and then delete the old definitions sheet. BUT clearly we also need to change the way the source_mat_id is created in all the Sampling tabs, final column - see my comments in https://github.com/emo-bon/observatory-profile/issues/13. Otherwise all the IDs are malformed. Christina did say this in an email some time back, so I think we don't need to check with her - you can just go ahead and do it. For that, you need to copy-paste the Sampling tabs so you can modify that column - as we discussed last week during our meeting. Do you also need help changing the equation? Christina did say how to do that in her email sent some time ago.

cpavloud commented 2 months ago

@melinalou this is wrong, these are the old definitions.

size_frac_low is used to retain the sample
size_frac_up is used to pre-filter/pre-sort the sample

See also here for the ongoing discussion we have with the GSC to correct officially the definitions.

melinalou commented 2 months ago

@kmexter Ok, yes I will change the definitions but as I can see you (or someone with permission) will need to delete the old definitions sheets after I create the new corrected one because it is locked and I can not delete it.

Referring to the source_mat_id, is this the equation that needs to be fixed? https://github.com/emo-bon/observatory-hcmr-1-crate/issues/13 To put _1 in all blanks and _2 whenever it is another one? (Or does it have to do with the size_frac_up?) I think that change alone is not enough if we need unique source_mat_ids, because e.g. in https://docs.google.com/spreadsheets/d/11_Eu0W1-sDiuzKx1cIl6YuxjRHmWezN6u9v3Ly8JZ3A/edit?gid=124596284#gid=124596284 we have the same id for lines 6 and 16 even though there are no blanks in the M column. Sorry if I didn't understand well.

melinalou commented 2 months ago

@cpavloud thank you!

kmexter commented 2 months ago

No, the blanks bit is another issue - don't do that one yet, or we will get confused. What needs changing is:
size_frac_low is in Q column
size_frac_up is in R column

The equation is =CONCATENATE(observatory!$A$2,"_",H2,"_",R2,"um","_",M2) (in row 2; of course the other rows have 3, 4, 5 etc. instead of 2), and the source_mat_id for the first sample is EMOBON_ROSKOGO_Wa_210618_200um_1

But it should be =CONCATENATE(observatory!$A$2,"_",H2,"_",Q2,"um","_",M2), so that the source_mat_id for the first sample would be EMOBON_ROSKOGO_Wa_210618_3um_1
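To illustrate the effect of switching from col R to col Q, here is a small sketch that mirrors the concatenation in Python. It assumes underscore separators (as the two example IDs imply), and it guesses the cell contents from those IDs: observatory!$A$2 holding "EMOBON_ROSKOGO_Wa", H the date, Q and R the two size fractions, M the replicate number. The function and variable names are hypothetical.

```python
def source_mat_id(prefix: str, date: str, size_frac: int, replicate: int) -> str:
    """Mirror CONCATENATE(prefix, "_", date, "_", size_frac, "um", "_", replicate)."""
    return f"{prefix}_{date}_{size_frac}um_{replicate}"


# Cell values guessed from the example IDs in this thread (assumption).
prefix = "EMOBON_ROSKOGO_Wa"  # observatory!$A$2
row = {"H": "210618", "Q": 3, "R": 200, "M": 1}

old_id = source_mat_id(prefix, row["H"], row["R"], row["M"])  # uses col R
new_id = source_mat_id(prefix, row["H"], row["Q"], row["M"])  # uses col Q

print(old_id)  # EMOBON_ROSKOGO_Wa_210618_200um_1
print(new_id)  # EMOBON_ROSKOGO_Wa_210618_3um_1
```

Only the size-fraction part of the ID changes; the prefix, date and replicate parts are untouched.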

melinalou commented 2 months ago

ok, I will fix it and let you know! Thank you.

melinalou commented 2 months ago

I made a copy of sampling where I've fixed the source_mat_id and a copy of definitions where I've changed the size_frac_up/low and n_alkanes definitions. So now we need to delete the old ones and rename the new ones.

kmexter commented 2 months ago

For which googlesheet did you do this - so I can check? Paste the URL here, please.

melinalou commented 2 months ago

All of them here https://github.com/emo-bon/governance-data/blob/main/logsheets.csv

kmexter commented 2 months ago

OK, I checked 2 and they are nice. However, you will need to remove the original Sampling tab and rename the "Copy of sampling" to "sampling" -> otherwise those will not be harvested (as we harvest on the name of the tab). Perhaps rename "Copy of definitions" to "Updated definitions". When you do remove the old sampling tab, you should make sure to copy over any comments that are still alive in there. Unfortunately comments get lost when you copy in the way I told you to, you see, but comments that are still alive are ones that the stations still need to act on.

Many of my comments in https://github.com/emo-bon/observatory-profile/issues/13 will now be solved - those related to size_frac_up and low - so bear that in mind as you work your way thru that issue.

Great work - boring I know, but it needs to be done!

melinalou commented 2 months ago

Yes, but I cannot delete the sampling and definitions tabs. Unfortunately I do not have the permission.

kmexter commented 2 months ago

Indeed - hence rename the "Copy of definitions" so it is clear that it is an update, not a copy.
Hmm, I could remove the "sampling" tab before, I am sure. Perhaps @melanthia has permissions if you do not?
But what you surely can do is copy over the comments that are still relevant? Most are raised by HQ and I cannot tell if they can be closed or not, so better you do it.

melinalou commented 2 months ago

Good. I will rename "Copy of definitions" to "Updated definitions" and copy the comments. I will inform you here when it is done.

melinalou commented 2 months ago

All done! https://github.com/emo-bon/governance-data/blob/main/logsheets.csv I renamed the copy of sampling -> new sampling and the copy of definitions -> updated definitions. I also copied all the comments. Please check whether the way I did it is helpful, because I could only copy and paste all the "chat".

cymon commented 2 months ago

Hi Folks,

Can I just get a summary of what happened here?

a) the definitions of size_frac_low and size_frac_up were changed and the definitions updated to "Updated definitions"
b) the sampling sheets were changed to use size_frac_up in the source_mat_id formatter: =CONCATENATE(observatory!$A$2,"_",H2,"_",Q2,"um","_",M2) i.e. Q2 instead of the old R2
c) the "sampling" sheet was renamed to "new sampling"

Is that correct? and has this happened to every observatory logsheet?

(BTW it would have been easier all 'round if the old sheets were labeled as old, and the new sheets kept their original names, but I realise that isn't always possible.)

Regards, Cymon

kmexter commented 2 months ago

OK, so as far as I can see that is good. However, it will be necessary to remove the "sampling" tab ASAP and rename "new sampling" to "sampling", otherwise (1) the source_mat_id in the measured tab (col 1) will be wrong and (2) the stations will be confused over which tab to fill in. Since @melinalou apparently does not have permission to do this, does @cpavloud or @melanthia have the necessary permissions?

cpavloud commented 2 months ago

I don't have permissions, no. I have tried to use the EMBRC secretariat credentials (which should work) but they don't. And at this point, no one knows the correct password for this account. I don't think we can do anything other than wait for @isanti to come back from her leave.

kmexter commented 2 months ago

You cannot edit the "Updated definitions" tab? That is there in all the water logsheets now.

cpavloud commented 2 months ago

Ah, I thought you meant the original "definitions" tab (which is locked). I can edit the "Updated definitions" tab, yes. But probably every one of us could do that...

melinalou commented 2 months ago

I think what we want is access to the sampling tab, which is locked, and also to the definitions tab, in order to delete them and rename the new tabs accordingly.

kmexter commented 2 months ago

Yes, EMOBON HQ should edit the Updated definitions. Since it appears that I do have permission to remove the old sampling tab and rename the new one, I will do that later today.

kmexter commented 2 months ago

@melinalou So neither do I have permission to remove the original sampling tab. sigh. HOWEVER, I could do this, and you could do the same. I edited this one: https://docs.google.com/spreadsheets/d/1hvLkBwiKTGTJDx19m_8e7qJ2lm9bwLLeVztMpxTLqnk/edit?gid=15718907#gid=15718907

  1. I RENAMED the original sampling tab to "old sampling - ignore"
  2. Then I RENAMED the new sampling tab to "sampling"
  3. Then I EDITED the equation in column 1 of the measured tab. It takes the column AM from the sampling tab, but when I renamed that tab, it also changed the equation from "=sampling!AM2" to "=old sampling - ignore!AM2", so I had to change it BACK to what it was before
  4. Then finally I MOVED the tabs so that observatory, sampling, and measured were tabs 2, 3, 4, Updated definitions was 5, and the tabs to be deleted are then pushed to the end

IS that clear? Can you do that for the other water logsheets?
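The "change it BACK" step can be sketched as a small string rewrite, purely as an illustration. The helper below is hypothetical, handles only simple one-cell references of the form "=<tab>!<cell>" as quoted in this thread, and does not deal with any quoting that a spreadsheet may apply to tab names.

```python
import re


def point_formula_at_tab(formula: str, tab: str) -> str:
    """Replace the tab name in a simple '=<tab>!<cell>' reference."""
    match = re.fullmatch(r"=(.+)!(\$?[A-Z]+\$?\d+)", formula)
    if match is None:
        raise ValueError(f"not a simple sheet reference: {formula!r}")
    return f"={tab}!{match.group(2)}"


# The reference that followed the renamed tab, pointed back at "sampling".
fixed = point_formula_at_tab("=old sampling - ignore!AM2", "sampling")
print(fixed)
```

In the logsheets themselves the same effect is achieved by editing the first row of column 1 in the measured tab and dragging the equation down.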

melanthia commented 2 months ago

Hi Katrina, This is my first day back to work after my summer holidays. I will check all your e-mails and try to do the corrections/edits requested. I will keep you updated with the process.

Cheers, Melina

melinalou commented 2 months ago

@kmexter Yes, I will do it for the other logsheets.

melinalou commented 2 months ago

@kmexter all done. (I named the old sampling tab "old-sampling".) If you want, have a look at a random one, e.g. https://docs.google.com/spreadsheets/d/1AvQMYcS0tdNMw6Er8zUarQg1a_wrshhnkTS6RuI1FJQ/edit?gid=15718907#gid=15718907 to be sure that it is ok.

kmexter commented 2 months ago

Can someone please tell me if I have chosen the correct BODC terms for these two properties?
size_frac_low: http://vocab.nerc.ac.uk/collection/P01/current/PRSZSPLW/ (I am unsure in particular because this definition says "retained" while the ENA definition says "excluded")
size_frac_up: https://vocab.nerc.ac.uk/collection/P06/current/UXMM/
I think @cpavloud understands this best....

cpavloud commented 2 months ago

The BODC term for size_frac_low is Pore size of sampling processor (lower filter)

The BODC term for size_frac_up is Pore size of sampling processor (upper filter)

kmexter commented 2 months ago

phew, so I got it right. So, of the list at the beginning, can you all confirm that we have

  1. swapped the column titles in all logsheets (all observatories, water and sediment)
  2. changed the definitions in the logsheets -> I don't think the examples in the definition tab have been changed, as it says 200 as an example for size_frac_low!
  3. source_mat_id equation has been changed
  4. I will check with laurian that these have changed in the ontology
  5. logsheet schema has been updated - yes
  6. QC updated - not done yet

melinalou commented 2 months ago

Good morning! 1 and 3 done! As for the second one I will take a look now and change the examples. I will let you know.

kmexter commented 2 months ago

I checked - 4 is done. we used the ENA definition in the ontology.

melinalou commented 2 months ago

2 checked and done.

kmexter commented 2 months ago

ok, only 6 left and that is for VLIZ to do

kmexter commented 1 month ago

see https://github.com/emo-bon/observatory-profile/issues/7 and https://github.com/emo-bon/data-quality-control-action/issues/8 for point 6.