inbo / alien-species-portal

Portal for alien and invasive species indicators
MIT License

Graph: number of introduced alien species per year per region of origin #46

Open SanderDevisscher opened 4 years ago

SanderDevisscher commented 4 years ago

Based on this indicator: https://trias-project.github.io/indicators/indicator_introductions_per_year.html but in the style of the figure "Gerapporteerd aantal per jaar en per regio" (reported number per year and per region) on the grofwildjacht page. So, per year, a stacked bar chart with the different regions.

With this data: https://github.com/trias-project/indicators/blob/master/data/interim/data_input_checklist_indicators.tsv

and the following selection options:

mvarewyck commented 4 years ago

@timadriaens @damianooldoni is "per regio van oorsprong" (per region of origin) the variable "native range" in the data? I found 150 different values for it, which might not be ideal to show in a stacked bar plot.

> unique(exotenData[, "native range"])
                                            native range
  1:                                                    
  2:                                       North America
  3:                                 North Pacific Ocean
  4:                            Eastern Asia (WGSRPD:38)
  5:                                   Europe (WGSRPD:1)
 ---                                                    
146:                                             Eurasia
147:                                       central Italy
148:                                       western Italy
149: northern coastal areas of the western Mediterranean
150:                            Mediterranean & Portugal

Also, multiple values are separated by "," or "&", not by "|" as indicated in https://docs.google.com/spreadsheets/d/1LeXXbry2ArK2rngsmFjz_xErwE1KwQ8ujtvHNmTVA6E/edit#gid=808615322
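
A minimal sketch of how such multi-valued entries could be split in base R, assuming "," and "&" are the only separators in use. The helper name `split_native_range` is hypothetical and not part of the checklist code:

```r
# Hypothetical helper: split each native range string on "," or "&"
# and trim the whitespace around every part.
split_native_range <- function(x) {
  parts <- strsplit(x, "[,&]")
  lapply(parts, trimws)
}

# Example values; the first one is taken from the output above.
ranges <- c("Mediterranean & Portugal", "North America", "Europe, Asia")
split_ranges <- split_native_range(ranges)
```

The result is a list with one character vector per input value, so entries that name several regions can be counted once per region.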

SanderDevisscher commented 4 years ago

@damianooldoni @mvarewyck

How hard is it to translate native range into the table below, analogous to habitat being translated into marine, freshwater and terrestrial?

| native range | Azië | Afrika | Noord-Amerika | Zuid-Amerika | Antarctica | Europa | Oceanië |
| --- | --- | --- | --- | --- | --- | --- | --- |
| North America | FALSE | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE |
| North Pacific Ocean | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE |
| Eastern Asia (WGSRPD | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
| Europe (WGSRPD | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| Eurasia | TRUE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| central Italy | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| western Italy | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| northern coastal areas of the western Mediterranean | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |
| Mediterranean & Portugal | FALSE | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE |

SanderDevisscher commented 4 years ago

> Also, multiple values are separated by "," or "&", not by "|" as indicated in https://docs.google.com/spreadsheets/d/1LeXXbry2ArK2rngsmFjz_xErwE1KwQ8ujtvHNmTVA6E/edit#gid=808615322

@damianooldoni can you look into this ?

timadriaens commented 4 years ago

Yes, that's one of the problems with the current checklist. In principle, native ranges should have been mapped at checklist level using the World Geographical Scheme for Recording Plant Species Distribution (WGSRPD, https://web.archive.org/web/20160125135239/http://www.nhm.ac.uk/hosted_sites/tdwg/TDWG_geo2.pdf) for plants and the UN geoscheme (https://en.wikipedia.org/wiki/United_Nations_geoscheme) for other species.

damianooldoni commented 4 years ago

@SanderDevisscher: it's a long recoding, and just tedious to write:

library(dplyr)

df %>%
  mutate(
    Azië = native_range == "Eurasia",
    Afrika = FALSE, # no TRUE values in the table above
    `Noord-Amerika` = native_range == "North America"
    # etc. for the remaining continents
  )

@timadriaens, should we do this mapping for the TrIAS indicators report as well, or is it something you want only for the R Shiny visualization webpage?

damianooldoni commented 4 years ago

@timadriaens: I will ask our TrIAS colleagues again about the possibility of mapping it at checklist publication level or while unifying the checklists.

timadriaens commented 4 years ago

@damianooldoni I think checklists will very often have more detailed (and not very standardized) information on native range. It would be a pity if this got lost at checklist publication level, and not very stimulating for the checklist owner who keeps track of such detail. I don't know what the best place is, and there is always the option of keeping a verbatim field, but it is we who made the very pragmatic decision to use WGSRPD for plants and the UN geoscheme for animals. These are very crude anyway.

timadriaens commented 4 years ago

So, following the logic of preserving as much detail as possible in the source checklists, I would say it is more something for the flow to the unified checklist. And of course, this should first of all be in the TrIAS indicators.

SanderDevisscher commented 4 years ago

> @SanderDevisscher: it's a long recoding, and just tedious to write:
>
> library(dplyr)
>
> df %>%
>   mutate(
>     Azië = native_range == "Eurasia",
>     Afrika = FALSE, # no TRUE values in the table above
>     `Noord-Amerika` = native_range == "North America"
>     # etc. for the remaining continents
>   )
>
> @timadriaens, should we do this mapping for the TrIAS indicators report as well, or is it something you want only for the R Shiny visualization webpage?

@damianooldoni what about grepl, or a manual mapping list?

damianooldoni commented 4 years ago

@SanderDevisscher : yes, a file containing the mapping seems way better.
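
A rough sketch of the mapping-file idea in base R. The mapping rows follow the table earlier in the thread, with continent labels written in English. The object names, the toy data and the `species` column are assumptions for illustration only; in practice the mapping could be read from a small CSV or TSV file:

```r
# Hypothetical mapping table (verbatim native range -> continent).
# One verbatim value may map to several continents, e.g. "Eurasia".
range_to_continent <- data.frame(
  native_range = c("North America", "Eurasia", "Eurasia",
                   "central Italy", "western Italy"),
  continent = c("North America", "Asia", "Europe", "Europe", "Europe"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the checklist data; column names are assumptions.
exoten <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp4"),
  native_range = c("Eurasia", "North America", "central Italy",
                   "somewhere unmapped"),
  stringsAsFactors = FALSE
)

# Left join: unmapped values are kept with continent = NA for review.
mapped <- merge(exoten, range_to_continent, by = "native_range", all.x = TRUE)
unmapped <- unique(mapped$native_range[is.na(mapped$continent)])
```

Keeping the unmapped values visible makes it easy to extend the mapping file as new verbatim values appear in the checklists.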

peterdesmet commented 4 years ago

Is the UN Geocode scheme this one: https://en.wikipedia.org/wiki/United_Nations_geoscheme?

If we're going to standardize it anyway, why not use one scheme for all?

timadriaens commented 4 years ago

Because the botanists wanted their own thing; see this issue.

timadriaens commented 4 years ago

there is of course also the issue of marine species where native range requires a dedicated vocab

peterdesmet commented 4 years ago

Regarding botanists: well, I think we (authors of the unified checklist) can choose to what extent we want to aggregate, including using the same scheme for plants and animals (marine taxa are indeed separate). The source data will always be available. So, it comes down to making a decision.

eadriaensen commented 4 years ago

@SanderDevisscher @timadriaens @damianooldoni I assume dealing with native_range is still ongoing?

Could you update whenever this is ready for me to continue working on the plot?

SanderDevisscher commented 4 years ago

@damianooldoni what is the status of the native range issue ?

timadriaens commented 4 years ago

Values have largely been standardized by @peterdesmet (see this issue), but I am not sure this has already been updated in the unified checklist. @damianooldoni?

SanderDevisscher commented 4 years ago

@damianooldoni can you make sure the mapping finds its way into data_input_checklist_indicators.tsv?

peterdesmet commented 4 years ago

It has not been included in the unified checklist, because:

timadriaens commented 4 years ago

As said and argued here, I definitely think they should be part of it, yes. Otherwise, breakdowns of indicators based on native ranges are impossible.

peterdesmet commented 4 years ago

The unified checklist has now been republished with standardized native range. Can you reprocess and see if that solves the issue?

damianooldoni commented 4 years ago

Thanks @peterdesmet. I will process the unified checklist data as soon as possible to include these changes.

eadriaensen commented 4 years ago

@SanderDevisscher @timadriaens @damianooldoni I imported the update of the checklist. So now there are 25 different values for native_range.

> table(exotenData[, "native_range"], exclude = NULL)

                   Africa                  Americas                      Asia 
                     1380                       211                       566 
Australia and New Zealand                 Caribbean           Central America 
                      192                         5                        25 
             Central Asia            Eastern Africa              Eastern Asia 
                       13                        24                      2549 
           Eastern Europe                    Europe                 Melanesia 
                     2587                      2511                         1 
               Micronesia             Middle Africa           Northern Africa 
                        2                        14                        23 
         Northern America             South America         Southeastern Asia 
                     1205                       553                        52 
          Southern Africa             Southern Asia           Southern Europe 
                       15                        75                        83 
           Western Africa              Western Asia            Western Europe 
                       19                        28                        18 
                     <NA> 
                      603 

Concerning the requested graph:

> but in the style of the figure "Gerapporteerd aantal per jaar en per regio" (reported number per year and per region) on the grofwildjacht page. So, per year, a stacked bar chart with the different regions.

Do I write the code for this graph? If so, be aware that the style might differ from the graphs currently used for "aantal geïntroduceerde uitheemse soorten per jaar" (number of introduced alien species per year) and "Cumulatief aantal uitheemse soorten" (cumulative number of alien species), as those come from the trias package. Or is there code available?

SanderDevisscher commented 4 years ago

> Do I write the code for this graph?

Yes, if it were up to me!

> If so, be aware that the style might differ from the graphs currently used for "aantal geïntroduceerde uitheemse soorten per jaar" (number of introduced alien species per year) and "Cumulatief aantal uitheemse soorten" (cumulative number of alien species), as those come from the trias package.

I think it is best to override the TrIAS style with the Inbotheme style, or at least use a style similar to the rest of the app.

> Or is there code available?

Not that I know of. Shall I make a basis for you to start from, or would you like to start from scratch?

peterdesmet commented 4 years ago

Note: the regions can be grouped in 5 continents: https://en.wikipedia.org/wiki/United_Nations_geoscheme
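
As an illustration, the 24 standardized `native_range` values listed above could be regrouped with a simple lookup that follows the UN geoscheme. The names `continent_lookup` and `to_continent` are hypothetical:

```r
# Lookup from the standardized native_range values to the five
# continental regions of the UN geoscheme.
continent_lookup <- c(
  "Africa" = "Africa", "Eastern Africa" = "Africa",
  "Middle Africa" = "Africa", "Northern Africa" = "Africa",
  "Southern Africa" = "Africa", "Western Africa" = "Africa",
  "Americas" = "Americas", "Caribbean" = "Americas",
  "Central America" = "Americas", "Northern America" = "Americas",
  "South America" = "Americas",
  "Asia" = "Asia", "Central Asia" = "Asia", "Eastern Asia" = "Asia",
  "Southeastern Asia" = "Asia", "Southern Asia" = "Asia",
  "Western Asia" = "Asia",
  "Europe" = "Europe", "Eastern Europe" = "Europe",
  "Southern Europe" = "Europe", "Western Europe" = "Europe",
  "Australia and New Zealand" = "Oceania", "Melanesia" = "Oceania",
  "Micronesia" = "Oceania"
)

# Hypothetical helper: returns NA for missing or unknown values.
to_continent <- function(native_range) {
  unname(continent_lookup[native_range])
}

continents <- to_continent(c("Melanesia", "Northern America", NA, "Eastern Asia"))
```

Missing values stay `NA`, so the 603 records without a native range remain identifiable after regrouping.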

eadriaensen commented 4 years ago

>> If so, be aware that the style might differ from the graphs currently used for "aantal geïntroduceerde uitheemse soorten per jaar" (number of introduced alien species per year) and "Cumulatief aantal uitheemse soorten" (cumulative number of alien species), as those come from the trias package.

> I think it is best to override the TrIAS style with the Inbotheme style, or at least use a style similar to the rest of the app.

Okay, I will give that a go.

>> Or is there code available?

> Not that I know of. Shall I make a basis for you to start from, or would you like to start from scratch?

Thanks, I have the code from grofwildjacht to start from, so that's okay.

eadriaensen commented 4 years ago

> Note: the regions can be grouped in 5 continents: https://en.wikipedia.org/wiki/United_Nations_geoscheme

Do you prefer to display at continent level (in that case I will regroup), or at native_range level as in the raw data?

SanderDevisscher commented 4 years ago

>> Note: the regions can be grouped in 5 continents: https://en.wikipedia.org/wiki/United_Nations_geoscheme

> Do you prefer to display at continent level (in that case I will regroup), or at native_range level as in the raw data?

Maybe add a unit type filter or a switch to increase the level of detail, with continent as the default.

eadriaensen commented 4 years ago

First version of the plot:

[Figure: exotenPerYearPerNative_continent]

SanderDevisscher commented 4 years ago

> First version of the plot:
>
> [Figure: exotenPerYearPerNative_continent]

Looks good, but the numbers are rather low. Is this filtered?

SanderDevisscher commented 4 years ago

@mvarewyck @eadriaensen is it possible to add an option to choose between relative and absolute numbers for this graph? The default should be relative.

The title should be "aandeel geïntroduceerde uitheemse soorten per jaar per regio van oorsprong" (share of introduced alien species per year per region of origin) when relative, and "aantal geïntroduceerde uitheemse soorten per jaar per regio van oorsprong" (number of introduced alien species per year per region of origin) when absolute.
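
A minimal base R sketch of that switch, assuming one row per introduced species with columns `year` and `region`. These column names and the helper `count_per_year_region` are hypothetical and do not describe the trias package API:

```r
# Hypothetical helper: count introductions per year and region, either as
# absolute numbers or as a share of that year's total.
count_per_year_region <- function(df, relative = TRUE) {
  counts <- table(year = df$year, region = df$region)
  if (relative) {
    counts <- prop.table(counts, margin = 1) # each row (year) sums to 1
  }
  title <- paste(
    if (relative) "aandeel" else "aantal",
    "geïntroduceerde uitheemse soorten per jaar per regio van oorsprong"
  )
  list(counts = counts, title = title)
}

# Toy data for illustration only.
toy <- data.frame(
  year = c(2000, 2000, 2000, 2001),
  region = c("Asia", "Asia", "Europe", "Europe")
)
absolute <- count_per_year_region(toy, relative = FALSE)
relative <- count_per_year_region(toy) # relative is the default
```

The returned table has one row per year and one column per region, which is the shape a stacked bar chart needs; the title changes together with the `relative` argument.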

eadriaensen commented 4 years ago

It would be good to implement the code for this plot in the trias package, for the sake of uniformity and given the limited remaining project hours. In that case, the option to display relative or absolute numbers should ideally be implemented there as a function argument.

If so, I can easily add a tick box in the app that lets the user switch between relative and absolute.

The current code used for the graph: exoten_indicatorYearByNativeRange.zip

@SanderDevisscher is this possible?

SanderDevisscher commented 4 years ago

@damianooldoni can you look into this ?

damianooldoni commented 4 years ago

I agree with @eadriaensen: it is better to develop this functionality in the trias package for the sake of uniformity.

But I go on holiday on Thursday, and I have no time to do it this afternoon or tomorrow. As maintainer of the trias package, I am more than happy to accept contributions and PRs. @SanderDevisscher or somebody else involved: do you maybe have some time to give it a try? I can review it when I come back (in about 2 weeks). In the meantime, could you use the development branch of the trias package for the Shiny app instead of master? Thanks.

SanderDevisscher commented 4 years ago

@damianooldoni I'll give it a go