matildabrown / rWCVP

Generating Summaries, Reports and Plots from the World Checklist of Vascular Plants
https://matildabrown.github.io/rWCVP/
GNU General Public License v3.0

Doubtful locations in distribution plots #46

Closed: alrichardbollans closed this issue 1 year ago

alrichardbollans commented 1 year ago

Plotting the following distributions with:

library(rWCVP)

distribution <- wcvp_distribution("Gentianales", taxon_rank="order", location_doubtful = FALSE)
wcvp_distribution_map(distribution)
distribution <- wcvp_distribution("Rubiaceae", taxon_rank="family", location_doubtful = FALSE)
wcvp_distribution_map(distribution)
distribution <- wcvp_distribution("Aidia pycnantha", taxon_rank="species", location_doubtful = FALSE)
wcvp_distribution_map(distribution)

This returns the following maps:

Gentianales: [image: Gents_plot]

Rubiaceae: [image: Rubiaceae]

Aidia pycnantha: [image: Aidia pycnantha]

The maps this produces for the family and order seem incorrect to me, i.e. there are certainly Gentianales (not doubtful!) in all or most of Africa and South America, Indonesia, China, etc. For example, Aidia pycnantha (Rubiaceae) is found in east China, and this shows up in the Rubiaceae and Aidia plots but not in the Gentianales plot. Similarly, more regions are marked in the Rubiaceae plot than in the Gentianales plot, which seems counterintuitive, and there are also lots of regions in the Rubiaceae plot which I think should be marked as 'native'.

Maybe this is the intended behaviour, but I suspect location_doubtful = FALSE is removing those regions that are doubtful for any species when it should(?) remove those regions that are doubtful for all species.
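To make the "any species" versus "all species" distinction concrete, here is a minimal base R sketch. The data frame is made up for illustration (it is not real WCVP data), and the code is not taken from the rWCVP source; it only contrasts the two filtering rules described above.

```r
# Toy species-by-area records; illustrative only, NOT real WCVP data.
records <- data.frame(
  species = c("sp_a", "sp_b", "sp_a", "sp_b"),
  area_code_l3 = c("CHS", "CHS", "BOR", "BOR"),
  location_doubtful = c(0, 1, 1, 1)
)

# Group the doubtful flag by area so each area can be summarised.
doubtful_by_area <- split(records$location_doubtful == 1, records$area_code_l3)

# Suspected behaviour: drop an area if ANY species is doubtful there.
drop_if_any <- names(Filter(any, doubtful_by_area))

# Proposed behaviour: drop an area only if ALL species are doubtful there.
drop_if_all <- names(Filter(all, doubtful_by_area))

kept_any_rule <- setdiff(unique(records$area_code_l3), drop_if_any)
kept_all_rule <- setdiff(unique(records$area_code_l3), drop_if_all)
```

Under the "any" rule both toy areas are dropped, because each has at least one doubtful record. Under the "all" rule only BOR is dropped and CHS is kept, since sp_a has a non-doubtful record there.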

barnabywalker commented 1 year ago

I think we discussed this before, then maybe never implemented a solution.

Deciding between native and the other occurrence types seems obvious (e.g. if some species in a genus/family/order are native to a region but others are location_doubtful/introduced/extinct, the region should show as native), but I'm not sure how to decide between the others.

I feel like the priority should run native > introduced > extinct > location_doubtful - does that sound right?
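As a rough illustration of that priority rule, the base R sketch below collapses several species-level records into one status per area. The `occurrence_type` column name and the records themselves are assumptions made for the example, not the structure rWCVP actually uses internally.

```r
# Proposed priority, highest first (taken from the comment above).
priority <- c("native", "introduced", "extinct", "location_doubtful")

# Toy data: one row per species-area combination; NOT real WCVP output.
# `occurrence_type` is a hypothetical column name used for illustration.
records <- data.frame(
  area_code_l3 = c("CHS", "CHS", "BOR", "BOR", "JAW"),
  occurrence_type = c("location_doubtful", "native", "extinct",
                      "introduced", "location_doubtful")
)

# Convert each type to its rank: 1 = native ... 4 = location_doubtful.
records$rank <- match(records$occurrence_type, priority)

# For each area, keep the highest-priority (lowest-rank) type.
best_rank <- tapply(records$rank, records$area_code_l3, min)
area_status <- setNames(priority[as.vector(best_rank)], names(best_rank))
```

With these toy rows, CHS resolves to native, BOR to introduced, and JAW to location_doubtful, because each area takes the highest-priority type found among its species.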

alrichardbollans commented 1 year ago

I think this really depends on individual purpose, and above the species level it gets complicated having a mix of statuses. In general I would agree that native > introduced. However, as far as I understand, while native and introduced are separate states, extinct and location_doubtful are modifiers of those states, e.g. a species can be doubtful in a location where it is presumed to be introduced, or extinct in a location where it was native/introduced.

I would be inclined to just label regions based on native/introduced states, and maybe use a cross-hatched colour for regions that have both native and introduced species. I think it gets quite messy trying to include extinct and doubtful info above the species level.
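A minimal sketch of that labelling scheme in base R, assuming a 0/1 `introduced` flag per record where 0 means native. The rows and the `label_area` helper are hypothetical, and this is not how rWCVP ended up handling it.

```r
# Toy records; the rows are made up and are NOT real WCVP data.
records <- data.frame(
  area_code_l3 = c("CHS", "CHS", "BOR", "JAW", "JAW"),
  introduced = c(0, 1, 1, 0, 0)
)

# Hypothetical helper: label an area from the introduced flags of its species,
# ignoring the extinct and location_doubtful modifiers entirely.
label_area <- function(introduced) {
  has_native <- any(introduced == 0)
  has_introduced <- any(introduced == 1)
  if (has_native && has_introduced) {
    "both"  # candidate for cross-hatching on the map
  } else if (has_native) {
    "native"
  } else {
    "introduced"
  }
}

area_labels <- tapply(records$introduced, records$area_code_l3, label_area)
```

Here CHS is labelled "both", BOR "introduced", and JAW "native"; a plotting layer could then map "both" to a hatched fill.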

barnabywalker commented 1 year ago

I think our intention with wcvp_distribution was that it would replicate the distributions displayed on POWO.

Looking at POWO, for individual species, it looks like:

For genera:

So I think I'll fix wcvp_distribution to follow that for taxonomic ranks from family upwards as well.

matildabrown commented 1 year ago

That sounds perfect @barnabywalker - I love the idea of cross-hatching @alrichardbollans but the intention is definitely to replicate the POWO mapping (though other themes in future versions is a tempting idea...).

Could you add a test for your fix too, Baz?

alrichardbollans commented 1 year ago

@barnabywalker yep this sounds good!

barnabywalker commented 1 year ago

Just adding tests for this, but I think it should be fixed now. Maps below - Gentianales and Rubiaceae showing up pretty much everywhere, which I think is right?

Gentianales: [image: gentianales-distribution]

Rubiaceae: [image: rubiaceae-distribution]

Aidia pycnantha: [image: aidia-pycnantha-distribution]

barnabywalker commented 1 year ago

I've added the fix and an associated test in #48, but it would be good for someone to double-check it works as expected before merging.