AtlasOfLivingAustralia / avh-hub

Australian Virtual Herbarium
https://avh.ala.org.au
Mozilla Public License 2.0

Conservation status #57

Closed nielsklazenga closed 6 months ago

nielsklazenga commented 8 years ago

There has been mention of an issue about the 'Conservation status' in the BioCache, but I can't find it, so, for good measure, copied from the HISCOM thread:

Ah, issue is here: biocache-store#142.

From: Hiscom-l [mailto:hiscom-l-bounces(at)chah.org.au] On Behalf Of Gillian Brown
Sent: Thursday, 16 June 2016 9:45 AM
To: 'hiscom-l@chah.org.au'
Subject: [Hiscom-l] AVH data - conservation status help

Hello HISCOMers, A consultant has asked us if they could get conservation status from AVH data. They want to know what taxa occur in a particular area and what the state conservation status of these taxa are, and they want to be able to get it from one source. I have found the facet in occurrences that allows you to filter by it but which status is it? Is it the state it was collected in? And why are there two on some records (e.g. Endangered, Vulnerable)? I remember talking about the two statuses at the MAHC/HISCOM meeting but I do not remember the outcome, sorry.

Also, when you download the data the conservation status field is not there even though on the AVH download fields page it is listed as column 69. In my download I only get 52 columns and they are not in the same order as the AVH help page.

Thanks in advance for your help.

Gill

From: Niels Klazenga [Niels.Klazenga(at)rbg.vic.gov.au]
Sent: Monday, 20 June 2016 7:33 PM
To: 'hiscom-l@chah.org.au'
Subject: Re: [Hiscom-l] AVH data - conservation status help

Hi all, Donna didn't hear back from me, as I was overseas on leave and am just catching up on this thread now. I might have missed one or two emails, so I apologise if I repeat and accidentally take credit for what someone else might already have said.

As Donna already said, the conservation status data as it appears in AVH was discussed at the MAHC/HISCOM meeting in Hobart in October last year. It has come up in various emails before and since, as it has been a problem right from the start. We probably understand it a bit better now than we did then.

I think the issue of the conservation status information is not one, but three different issues, all of which are bigger issues that go beyond conservation status alone. I believe a quick fix for conservation status information only will be largely unhelpful symptom treatment and that we should look into resolving the underlying issues. Here they are (in my view):

  1. Conservation status is not AVH data: it comes from the BIE (I imagine). The issue is that it is not clear to the user which fields in the BioCache are occurrence data (provided or processed) and which fields are added from other sources. The same separation we have now between the occurrence data and the spatial data that is added based on the latitude and longitude (there is room for improvement there too), we should also have between the occurrence data and the BIE data that is added based on the (processed) scientific name. This is primarily a matter of how the record detail page is set up. I saw an email about an issue Nick created (or commented on) about the design of the record detail page when I was away, so I think this is already going to happen. There is also a lot of work done on the downloads and I already saw a development version of a very fancy download page, so it is not useful to look into fields that are missing from the downloads now. We are having the same issue with the IRMNG habitat information that is added to AVH records. Until recently, this showed up on the record detail page as the processed value of the verbatim habitat information we deliver, while in fact it was information that came from somewhere else. In AVH, we have called it 'biome' now, but it is still not clear (to AVH users) where it comes from. Establishment means is in AVH currently only as occurrence data. It is rather spottily delivered – I personally think it should only be delivered if someone has bothered to fill it in. It would be nice, once the next issue has been resolved, to get establishment means from the BIE as well, so we can have it for all records, but it needs to be in different fields and it needs to be clear what is what. Hopefully in future we can also get profile data, such as life form, from the BIE into AVH.
  2. Some of the information from the BIE, including conservation status, does not relate to a whole taxon, but to a taxon in a specific geographic area. The geographic aspect is currently ignored (not only in the BIE, also in the NSL). We are now encountering an issue with misapplied and excluded taxa. When I did a search for records of mosses belonging to excluded taxa, I ended up with only records from New Zealand (of species that do occur in New Zealand). Doug is aware of the issue – he brought it up himself when I handed him the moss name data – and I am sure that he and Dave, when further developing the BIE, will find (and implement) a solution. I hope that when they do, they find a general solution for all information from BIE that has a geographic component. A taxon that is native to Western Australia may be introduced in Victoria. Likewise, the conservation status for a taxon in Tasmania does not apply to a record of that taxon from Queensland and EPBC does not apply to records from New Zealand. By the way, the SDS already deals rather well with both the taxonomic and geographic components. On the surface I see a similarity there, so it might be useful to look at the SDS, but the code base might be entirely different.
  3. When discussing this at the HISCOM meeting, we first noticed (it was a first for me at least) that the BIE gives both 'status' and 'source status' for EPBC and state conservation statuses and (for EPBC) these are sometimes different (as Gill points out in her email as well; for some states they are always different). These fields come from the lists (http://lists.ala.org.au/speciesListItem/list/dr656 for EPBC). The source status is what you'll find in the EPBC and the state lists, but where do the values in the status column come from and how can they be different? It is those values that end up in the BioCache (I think), which is what I think I remember brought on the discussion we had at the HISCOM meeting (I have CC-ed in Frank Zich, as I think it was he who brought it up then). The easy "solution" would be to just ditch the 'status' field and use 'source status' instead, but the deeper issue that is going to fester is that there is insufficient metadata for the lists and it is not documented what the columns mean and where the data comes from. So I think it would be more constructive to look first into adding metadata to the lists – analogous to the eml.xml and meta.xml files in a DwC Archive. Then we can find out how conservation status is derived and whether those differing values (for EPBC) are indeed wrong (I can't see how they cannot be, but still) and then decide what to do next. The lists, while maybe not all that flashy by themselves, are a very important part of ALA. Many other parts of ALA rely on it and that will only increase with ALA growing. If we can resolve a problem by improving the lists, we are probably going to resolve a lot of other problems and prevent future ones. Adding list and column metadata is the biggest improvement to the lists I can think of, besides resolving part of the conservation status issue.
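The geographic point in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Listing`, `Record`, `applicable_listings` and the sample taxon are hypothetical names and made-up data, not BIE or BioCache structures, and jurisdictions are matched by plain string equality.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Listing:
    """One row of a conservation list (hypothetical structure)."""
    taxon: str
    jurisdiction: str  # a state/territory, or a country for a national list
    source_status: str


@dataclass(frozen=True)
class Record:
    """The parts of an occurrence record needed here (hypothetical structure)."""
    taxon: str
    country: str
    state: Optional[str] = None


def applicable_listings(record: Record, listings: List[Listing]) -> List[Listing]:
    """Keep only listings for the record's taxon whose jurisdiction is the
    record's own state/territory or its country."""
    places = {record.country}
    if record.state is not None:
        places.add(record.state)
    return [
        item for item in listings
        if item.taxon == record.taxon and item.jurisdiction in places
    ]


# Made-up data for illustration only.
listings = [
    Listing("Examplea fictiva", "Tasmania", "Endangered"),
    Listing("Examplea fictiva", "Australia", "Vulnerable"),
]
qld = Record("Examplea fictiva", "Australia", "Queensland")
nz = Record("Examplea fictiva", "New Zealand")

print([item.jurisdiction for item in applicable_listings(qld, listings)])  # ['Australia']
print([item.jurisdiction for item in applicable_listings(nz, listings)])   # []
```

With this kind of filter the Tasmanian listing is not attached to the Queensland record, and nothing from an Australian list is attached to the New Zealand record.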

I thought this would be a quick email...

Niels

From: Donna Lewis [Donna.Lewis(at)nt.gov.au]
Sent: Thursday, 23 June 2016 10:06 AM
To: 'hiscom-l@chah.org.au'
Subject: Re: [Hiscom-l] AVH data - conservation status help

Hi All, Nick – thanks for creating an issue for this to be dealt with in the next ALA maintenance sprint in July.

Niels – thanks for your response below. I completely agree that conservation status is not AVH data as conservation status relates to a taxon, not a specimen. I think we should continue to include it (conservation status) in AVH, but it must be presented correctly as the data has many implications for a range of uses, especially if a species has a ‘threatened status’.

Currently AVH presents conservation status for a specimen record as follows:

{
    "State conservation": "Endangered,Vulnerable",
    "Country conservation": "Endangered,Vulnerable"
}

Two main issues outlined below (I’m sure there are more):

  1. The State conservation and Country conservation fields provide both state and Australian legislation (comma-separated), but which state legislation do they provide? The field does not indicate whether it is QLD, NT, WA, etc. Also, why are both State and Australian legislation displayed in both fields? A number of examples identified by Gill and me:
    • Acacia peuce: This is a BRI record; however, the state conservation presents both NT legislation and EPBC. The QLD legislation is not displayed. So we need to decide: do we provide all State/Territory and Australian legislation like the ALA does, e.g. http://bie.ala.org.au/species/Acacia+peuce, or, depending on where the record is from, should it be the state legislation for that record, plus EPBC?
    • Cycas armstrongii: For this NT record there is only NT legislation; however, looking at the record, 2 categories are displayed for ‘State conservation’.
  2. Rename fields. State conservation/Country conservation implies it is a conservation status for all taxa that have been assessed. If, however, AVH displays only categories of threat (Critically Endangered, Endangered & Vulnerable, according to the IUCN Red List Categories), then the fields should be renamed more appropriately, i.e. ‘State threatened status’/‘Country threatened status’.

I agree with Niels in that we need to fix this issue properly, not a quick fix.

Nick – is this issue bigger than dealing with it in the next maintenance sprint?

Cheers

Donna

I hadn't noticed the 'Country conservation' on the record detail page before (it hasn't always been there, I think). From Donna's examples above, it appears that the correct list, i.e. the list from the state (or territory) the record is from, is used for the state conservation. Only, in both the state and country conservation fields, the 'status' and 'sourceStatus' from the relevant list are concatenated. Nobody seems to know where the values in the 'status' column in the lists come from or how they are derived. So the issue that Gill originally reported can probably be easily fixed by deleting the 'status' columns from the lists (just not using them might be a less radical solution). We can then reference the lists in the AVH Help.
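A rough Python sketch of what that would mean in practice: find the list rows where 'status' and 'sourceStatus' disagree, and build the displayed value from 'sourceStatus' alone. The column names 'status' and 'sourceStatus' are the ones in the lists; the 'name' column, the function names and the sample rows are assumptions made up for illustration, and this is not how the BioCache actually builds the field.

```python
from typing import Dict, List

Row = Dict[str, str]


def normalise(value: str) -> str:
    """Ignore case and surrounding whitespace when comparing values."""
    return value.strip().casefold()


def differing_rows(rows: List[Row]) -> List[Row]:
    """Rows whose 'status' does not match their 'sourceStatus'."""
    return [
        row for row in rows
        if normalise(row["status"]) != normalise(row["sourceStatus"])
    ]


def display_value(row: Row) -> str:
    """Value to show on the record: the verbatim 'sourceStatus' only."""
    return row["sourceStatus"]


# Made-up rows for illustration only.
rows = [
    {"name": "Taxon A", "status": "Endangered", "sourceStatus": "Endangered"},
    {"name": "Taxon B", "status": "Endangered", "sourceStatus": "Vulnerable"},
]

for row in differing_rows(rows):
    print(row["name"], row["status"], "!=", row["sourceStatus"])
```

Running something like `differing_rows` over each list first would show how many rows are affected before anything is removed.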

nickdos commented 7 years ago

TL;DR

nickdos commented 7 years ago

I've linked the status value to the state and its species list (where data originated)...

How does this look?

http://avh-test.ala.org.au/occurrences/5529c0cc-c273-4cdb-aead-d032509251f5

nielsklazenga commented 7 years ago

It should be linked to sourceStatus only. The status column in the conservation lists is what has always been causing the confusion. Nobody knows where the data in the status column comes from and there seems to be no relationship between the values in the sourceStatus and status columns. For Acacia hystrix subsp. continua (the taxon in the example), the translation is not too bad, although 'data deficient' would have been better (and is an IUCN threat status), but for the actually listed taxa, values that were already IUCN threat statuses have been translated into others and we don't know on what grounds.

nickdos commented 7 years ago

The 2 values are the verbatim value (provided by the state) and a matched value (so we can allow searches across states to be made). The states all use different (source) status values, so we try to match to a common set of terms (IUCN maybe), so that the user does not need to know the particulars of each state's terms and their meanings.
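As a minimal sketch of that idea in Python, keeping the verbatim value next to the matched one: the mapping table below is entirely made up, since how the real "common" values are generated is not known here, and `MatchedStatus`, `COMMON_TERMS` and `match_status` are hypothetical names.

```python
from typing import Dict, NamedTuple, Optional


class MatchedStatus(NamedTuple):
    verbatim: str           # value exactly as provided in the state's list
    matched: Optional[str]  # common term, or None when there is no known match
    source_list: str        # which list the verbatim value came from


# Illustrative mapping only; not the mapping ALA uses.
COMMON_TERMS: Dict[str, str] = {
    "endangered": "Endangered",
    "e": "Endangered",
    "vulnerable": "Vulnerable",
    "v": "Vulnerable",
}


def match_status(verbatim: str, source_list: str) -> MatchedStatus:
    """Look the verbatim value up in the common vocabulary; leave the
    matched value empty rather than guessing when it is not found."""
    key = verbatim.strip().casefold()
    return MatchedStatus(verbatim, COMMON_TERMS.get(key), source_list)


print(match_status("V", "hypothetical state list"))
print(match_status("Some local category", "hypothetical state list"))
```

Returning `None` for unmatched terms keeps the gaps in the mapping visible instead of silently translating one status into another.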

The lists are maintained by the states but I don't know how the "common" status values are generated and by whom. @M-Nicholls handles this, so I'll try and find out.

It would be better to show both values but clearly indicate where they come from and what they mean, plus correct any problems with the values (as opposed to just removing it from the display).

M-Nicholls commented 7 years ago

From looking at the lists and going through some of the older emails about this I think we should remove the ALA mapped "status" field, just keep the sourceStatus field. People can use the national EPBC list for overall querying if they want to do that.

Any objections?

nielsklazenga commented 7 years ago

Thanks Miles. Definitely the best solution. No objections from me.

DNAdl8 commented 7 years ago

I agree too - less complicated. So to clarify, only the 'state conservation' will be made available - not EPBC? I like how the State/Territory is clearly stated in that field

ryonen commented 7 years ago

Ummm, shouldn't both State/Territory and National (EPBC) be displayed (clearly indicated, as per Nick's comment) as they're not always the same? E.g. the NT endemic Atalaya brevialata, which is listed as Data Deficient in the NT and Critically Endangered in EPBC. If only State/Territory was displayed this would just come up as DD in ALA/AVH. Is the expectation that the user would then independently search EPBC to check any listing there? I query whether they would do so for a taxon presented as DD.

DNAdl8 commented 7 years ago

I think we agreed in the HISCOM Teleconference 26/06/17 that 'Country Conservation' should be visible again, not just 'State Conservation'.

nickdos commented 7 years ago

I'll make it happen. Country conservation wasn't being indexed when I fixed the display of state conservation but I'll check again now.

adam-collins commented 6 months ago

Should be there somewhere today.