Open Louis-Backstrom opened 6 years ago
Yeah, I really like this idea of highlighting poorly sampled areas in Brisbane. What do you think about just applying the same rules we already use for identifying "grid cells that are poorly sampled by checklists" (i.e. grey cells in the "All year" map) and "grid cells that are poorly sampled by records" (i.e. grey cells in the "Detections" map) instead of showing these new metrics? I worry that having multiple schemes/metrics for saying which areas are poorly sampled might confuse some people. In terms of fitting this into the atlas, what if we created a new chapter called "Brisbane city", moved the "Brisbane's environment" chapter into it as a section, and put this new "Brisbane's checklists" (or better name?) section under the "Brisbane city" chapter too?
Also, I like this idea of monitoring the "health"/"quality" of the data. Maybe we could add some time-series graphs to display changes in grid cell sampling coverage over time?
I think we should be wary of overcomplicating things for sure, but I reckon having a bit more detail than just checklists/records could be helpful too. Perhaps we could have that on the "front end" and then something more in-depth (like my set of criteria or similar) on the back-end, which then gets displayed as the health / time graph you suggested?
This is a brilliant idea, and it actually complements the survey sheet functionality, e.g. the survey sheets could be downloadable via a pop-up link when clicking on each grid square on these maps. I agree the metrics need to be simple, and beyond that they should closely reflect what we are using in the main species accounts, as there needs to be a strong logical connection between how we actually calculate and display data in the species accounts and how we measure the health of sampling in each grid square. I'll have a think about these.
With everything else that needs to be done prior to launch (writing descriptions for surveyor sheets, drafting accounts, adding photos, etc.), I think we should hibernate this for a bit.
Sounds good. Also - I really like the "hibernated" tag - it conveys that idea of "we can work on this later", and it has biological connotations too!
I was thinking it might be a worthwhile thing to have a page that shows a map of all the grid squares (or more realistically just the land ones) shaded by how "complete" the data is for that square as of the latest update - a bit more detail than the binary yes/no shading that currently exists for the species maps and hopefully useful for showing people where and what data is still needed. Squares could be graded as "no data", "limited data" and "sufficient data" or similar.
I'm not sure what data we necessarily want, but I came up with a sort of weighted criteria set that we could apply to each cell to determine whether the data is "acceptable" or not - this is sort of what I was thinking:
Criterion: Complete Records - 25% Weighting
Criterion: Seasonal Records - 10% Weighting
Criterion: Sampling Events - 15% Weighting
Criterion: Total Time - 15% Weighting
Criterion: Seasonal Time - 5% Weighting
Criterion: Time of Day - 20% Weighting
Criterion: Number of Observers - 10% Weighting
Obviously the weighting, required values and criteria themselves are all subject to change to come up with a complete set of criteria that is fair and reflects the kind of data we want in the atlas. Theoretically, all these values should be relatively straightforward to automatically calculate as all the requisite data should be in the eBird database. The only other criteria ideas I had were number of sites (say >5, so there's not just one big hotspot that gobbles everyone up) and number of species recorded (but I think this implies that there's a set number of species that must be recorded, which isn't necessarily valid).
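To make the weighting idea a bit more concrete, here is a rough Python sketch of how a per-cell score could be put together from the criteria listed above. The weights are the ones from the list; everything else is an assumption for illustration only: the names `cell_score` and `cell_grade`, the idea that each criterion is simply met or not met, and the placeholder cut-off of 70 for "sufficient data". The required values for each criterion are deliberately left out, since they are still to be decided.

```python
# Weights (in percent) taken from the criteria list; they sum to 100.
WEIGHTS = {
    "complete_records": 25,
    "seasonal_records": 10,
    "sampling_events": 15,
    "total_time": 15,
    "seasonal_time": 5,
    "time_of_day": 20,
    "number_of_observers": 10,
}


def cell_score(passed):
    """Return a 0-100 score for one grid cell.

    `passed` maps criterion names to True/False (met or not met).
    Criteria missing from `passed` are treated as not met (an assumption).
    """
    return sum(weight for name, weight in WEIGHTS.items() if passed.get(name, False))


def cell_grade(score, has_data=True, sufficient_at=70):
    """Map a score to one of the three proposed labels.

    `sufficient_at` is a hypothetical cut-off, not an agreed value.
    """
    if not has_data:
        return "no data"
    return "sufficient data" if score >= sufficient_at else "limited data"


# Hypothetical cell that meets three of the seven criteria.
example = {"complete_records": True, "sampling_events": True, "time_of_day": True}
score = cell_score(example)  # 25 + 15 + 20
print(score, cell_grade(score))
```

A simple met/not-met check per criterion keeps the score easy to explain; partial credit (e.g. scaling each weight by how close the cell is to the required value) would be a straightforward extension if we wanted finer shading on the map.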
Hope this makes sense - doesn't necessarily need to be implemented but I thought something like this would be useful for measuring the "health" of the dataset.