Open andersfi opened 1 month ago
So I had a look at the data and they seem to make some sense. Inside the GDB file there are 6 tables:
- ANO_SurveyPoint - point layer - individual points surveyed
- ANO_Flate - polygon layer - encompassing ANO_SurveyPoint, but some plots have no ANO_SurveyPoint entry
- ANO_Art - simple table - list of species of vascular plants found on the ANO_SurveyPoint, connected using ParentGlobalID (ANO_Art) > GlobalID (ANO_SurveyPoint)
- ANO_FremmedArt - same as ANO_Art but for alien species
- ANO_ProblemArt - same as ANO_Art but for invasive species
- ANO_Treslag - same as ANO_Art but for tree species
So if I understand it correctly, there should be a parent event for ANO_Flate, which would have child events for ANO_SurveyPoint, and each ANO_SurveyPoint event would have occurrences of species (species + alien species + invasive species + tree species).
NiN mapping and other measurements would go in the EMoF extension.
Does that sound reasonable?
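To make the proposed structure concrete, here is a rough sketch on made-up rows. The table names, `GlobalID`, `ParentGlobalID` and `art_dekning` come from the GDB; the `flate_id` column linking a survey point to its polygon, the `art_navn` species column and the function names are assumptions for illustration only.

```python
# Made-up rows standing in for the GDB tables.
flater = [{"GlobalID": "F1"}, {"GlobalID": "F2"}]
survey_points = [
    {"GlobalID": "P1", "flate_id": "F1"},  # flate_id is an ASSUMED link column
    {"GlobalID": "P2", "flate_id": "F1"},
]
species_tables = {  # art_navn is an ASSUMED column name
    "ANO_Art": [{"ParentGlobalID": "P1", "art_navn": "species A", "art_dekning": 24}],
    "ANO_FremmedArt": [{"ParentGlobalID": "P2", "art_navn": "species B", "art_dekning": 0.1}],
    "ANO_ProblemArt": [{"ParentGlobalID": "P9", "art_navn": "species C", "art_dekning": 1}],
    "ANO_Treslag": [{"ParentGlobalID": "P1", "art_navn": "species D", "art_dekning": 5}],
}


def build_events(flater, survey_points):
    """Parent events for ANO_Flate, child events for ANO_SurveyPoint."""
    events = [{"eventID": f["GlobalID"], "parentEventID": ""} for f in flater]
    events += [
        {"eventID": p["GlobalID"], "parentEventID": p["flate_id"]}
        for p in survey_points
    ]
    return events


def build_occurrences(species_tables, survey_points):
    """Attach species rows to survey points via ParentGlobalID > GlobalID."""
    known_points = {p["GlobalID"] for p in survey_points}
    occurrences = []
    for table_name, rows in species_tables.items():
        for row in rows:
            if row["ParentGlobalID"] not in known_points:
                continue  # orphan row: no matching survey point
            occurrences.append({
                "eventID": row["ParentGlobalID"],
                "scientificName": row["art_navn"],
                "sourceTable": table_name,
            })
    return occurrences


events = build_events(flater, survey_points)
occurrences = build_occurrences(species_tables, survey_points)
```

Keeping the source table name on each occurrence is one way to carry the species / alien / invasive / tree distinction at record level.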
I have a question about the art_dekning column in the ANO_Art table. I understand that this should be the coverage on the plot, but some values are fractional, like 0.1, and others are round numbers like 24. It seems to me that 0.1 should actually mean 10%.
EDIT: I went a bit deeper, and I'm not sure whether this is a mistake or not. There is an obvious bias towards 0.1 in particular, as the histogram of the values shows. It might mean that if there is only one specimen on the plot, surveyors choose 0.1% coverage.
I am not sure why there is this bias towards 0.1. It seems odd; I also interpreted this as coverage in %. Maybe it is as simple as those doing the mapping entering 0.1 as a default value when the coverage is very slim and close to 0? I think we need to get in touch with the data owner to clarify.
Yes, this sounds reasonable. If I understand it right, there are different sampling methods for "species", "alien species", "tree species" and "NiN mapping". Accordingly, these sound like separate events? It should not be technically difficult to sort out; however, we lack a GUID for identifying these events. Maybe we should use a composite identifier for these events instead of adding a GUID? We need to discuss this with the data owner(?)
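One way to picture the composite identifier idea is to derive each event ID from the survey point's GlobalID plus a label for the sampling method. This is a minimal sketch only: the method labels, the `:` separator and the function name are illustrative assumptions, not anything agreed with the data owner.

```python
# ASSUMED labels for the sampling methods, keyed by source table.
METHOD_BY_TABLE = {
    "ANO_Art": "species",
    "ANO_FremmedArt": "alien-species",
    "ANO_ProblemArt": "invasive-species",
    "ANO_Treslag": "tree-species",
}


def composite_event_id(point_global_id: str, table_name: str) -> str:
    """Combine the survey point GlobalID with a sampling-method label."""
    return f"{point_global_id}:{METHOD_BY_TABLE[table_name]}"


# Hypothetical GlobalID value, for illustration.
example_id = composite_event_id("P1", "ANO_Treslag")
```

Such an ID would only stay stable across the yearly updates if the survey point GlobalID itself is stable, which is something to confirm with the data owner.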
I mean, we don't really have any additional info for the events (except for the info we already have in ANO_SurveyPoint), and we can specify the sampling method at the record level instead of the event level. That would simplify the overall structure (and would look better on the dataset page after publication).
But we can do it like you said as well, of course :) We need to have a meeting with the data owner.
I remember asking Ole Einar about the 0,1 value for dekning, because it puzzled me when I imported the survey data: "0,1% stemmer. De setter den verdien når de finner typ ett individ av en liten art. Har fått innspill på at de ønsker å kunne sette 0,1 % kontra 1%. Derfor vil eldre data ha 1 % som laveste dekning." Translated: "0.1% is correct. They [the surveyors] set this value when they find roughly a single individual of a small species. We have received feedback that they want to be able to set 0.1% rather than 1%. Because of this, older data will have 1% as the lowest value for dekning."
I discovered this when importing survey data from 2023, so I think this means they used 1% as the lowest value for 2019-2022 and 0.1% in 2023.
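The guess about the lowest value per year could be checked with something like the sketch below. The (year, art_dekning) pairs are made up, and how the survey year is obtained for each ANO_Art row is an assumption left open here.

```python
# Made-up (survey year, art_dekning) pairs; real values would come from the GDB.
records = [(2021, 1), (2021, 24), (2022, 1), (2022, 3), (2023, 0.1), (2023, 5)]


def min_coverage_by_year(records):
    """Return the lowest recorded coverage for each survey year."""
    lowest = {}
    for year, cover in records:
        if year not in lowest or cover < lowest[year]:
            lowest[year] = cover
    return lowest


lowest_by_year = min_coverage_by_year(records)
```

If the real data show 1 as the minimum through 2022 and 0.1 from 2023, that would support the explanation above; any other pattern would be worth raising with the data owner.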
Correct, some ANO_Flate have no ANO_SurveyPoint. From what I remember this is because the Flate polygons are randomly chosen and a few of them are in areas where no point can be surveyed (a lake, a very steep mountain side etc).
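Listing the polygons without any survey point is a simple set difference. A small sketch on made-up rows; the `flate_id` column linking a point to its polygon is an assumed name, not taken from the GDB.

```python
# Made-up rows; flate_id is an ASSUMED link column on ANO_SurveyPoint.
flater = [{"GlobalID": "F1"}, {"GlobalID": "F2"}, {"GlobalID": "F3"}]
survey_points = [
    {"GlobalID": "P1", "flate_id": "F1"},
    {"GlobalID": "P2", "flate_id": "F1"},
    {"GlobalID": "P3", "flate_id": "F3"},
]


def flater_without_points(flater, survey_points):
    """Return GlobalIDs of ANO_Flate polygons that have no ANO_SurveyPoint."""
    surveyed = {p["flate_id"] for p in survey_points}
    return [f["GlobalID"] for f in flater if f["GlobalID"] not in surveyed]


empty_flater = flater_without_points(flater, survey_points)
```

These polygons would still become parent events, just without child events.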
Well, I am very happy with compromises and everything that makes life simpler. However, we need to be able to pin a taxonomic scope to the various events. For example, the "invasive species" and "tree species" events will have different taxonomic scopes, and I can't figure out how to document this except with the Humboldt extension?
This is a request relayed from the Norwegian Environment Agency (MDIR) about the possibility of publishing the Area Representative Monitoring data (Arealrepresentativ naturovervåkning - ANO) on GBIF.
This is a very important dataset for both research and management, and we should prioritize helping to get it published.
The data are available for download in .gdb format from MDIR's homepage (they promise a stable URL and a stable data structure). The dataset is updated once a year. I think the mapping is fairly straightforward, but with some small issues (mainly related to the hierarchical sampling design and IDs). A suggestion is to facilitate this and speed up the publishing by setting up a pipeline that maps the data from the .gdb database to a DwC-A and publishes it on GBIF.no's IPT.
A document describing the dataset and tentative mapping is found here: https://docs.google.com/document/d/1ozhrI2xdN5dK0FgiQ-vBE-_NNEXaAGrLhJflKzUK9Dw/edit?usp=sharing (sorry, only in Norwegian, mainly used for communication with MDIR until now).