simogeo / geostats

A tiny and standalone javascript library for classification and basic statistics :
https://www.intermezzo-coop.eu/mapping/geostats/

Show actual class limits #10

Closed tmiosmauli closed 11 years ago

tmiosmauli commented 11 years ago

Currently the upper class limit is the same as the lower limit of the next class, so it is unclear which class a value belongs to when it lies on the border. You can see this in the legend of example 9: http://www.empreinte-urbaine.eu/mapping/geostats/

It would be better if the class limits showed the actual values in the data (i.e. the maximum value in the data set belonging to a given class and the minimum value in the data set belonging to the next class). This could be an option of the getHtmlLegend() method, if both alternatives are desired.

Thanks for a nice lib!

simogeo commented 11 years ago

> It would be better if the class limits showed the actual values in the data (i.e. the maximum value in the data set belonging to a given class and the minimum value in the data set belonging to the next class). This could be an option of the getHtmlLegend() method, if both alternatives are desired.

:-1: Not at all, this would break the whole classification. Values have to be continuous.

Actually, these values :

 29 ⇔ 379 (10)
 379 ⇔ 2762 (9)
 2762 ⇔ 6885 (9)

should be transformed to :

 29 ⇔ 379 (10)
 380 ⇔ 2762 (9)
 2763 ⇔ 6885 (9)

The sample above is quite easy. But it becomes a bit tricky with float values :

 29.260507 - 378.801557
 378.801557 - 2762
 2762 - 6884.84825

Decimal precision could be entered by the user and a transformation applied accordingly. I will think about it, but when working on the library I followed the example of GIS software such as QGIS, which also displays shared boundaries.
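The transformation discussed above can be sketched as follows. This is only an illustration under stated assumptions: `closeGaps` is a hypothetical helper, not a geostats function, and it assumes the bounds are already rounded to the chosen precision and that a value equal to a shared bound belongs to the lower class.

```javascript
// Hypothetical helper (not part of geostats): turn shared class bounds into
// non-overlapping ranges by raising every lower bound except the first
// by one unit of the chosen decimal precision.
function closeGaps(bounds, precision) {
  const step = 1 / Math.pow(10, precision); // e.g. 0.01 for precision 2
  const ranges = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    const lower = i === 0 ? bounds[i] : bounds[i] + step;
    // toFixed() also hides floating-point noise such as 1.6000000000000001
    ranges.push([lower.toFixed(precision), bounds[i + 1].toFixed(precision)]);
  }
  return ranges;
}

console.log(closeGaps([29, 379, 2762, 6885], 0));
```

With the integer bounds from the example this gives 29 to 379, 380 to 2762 and 2763 to 6885. For floats the same function works once a precision is chosen, which is why the precision has to come from the user or be derived from the data.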

tmiosmauli commented 11 years ago

The book Thematic Cartography and Geovisualization by Terry A. Slocum et al. describes three approaches for specifying class limits:

1) indicating the range of data actually falling in each class (produces gaps between classes). Like so:

 29.260507 - 378.801557
 420.000001 - 2762
 3029.808861 - 6884.84825
 7434.30475 - 22668.854812
 24350.762236 - 258524.672469

2) eliminating the gaps by expanding classes (this would be like your example, and yes with float values it is a bit tricky, but luckily only a bit)

3) indicating the minimum and maximum data values and the upper limit of each class. In this approach the numbers are located between the class colors in the legend.
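As a rough illustration of approach 1), the sketch below reports the smallest and largest data values that actually fall in each class. `actualRanges` is a hypothetical helper, not a geostats function, the sample data in the usage line is made up, and the rule that a value equal to a shared bound goes to the lower class is an assumption.

```javascript
// Hypothetical helper (not part of geostats): for each class defined by
// shared bounds, return [min, max] of the data values that fall in it,
// or null when the class is empty.
// Assumption: a value equal to a shared bound belongs to the lower class.
function actualRanges(values, bounds) {
  const sorted = [...values].sort((a, b) => a - b);
  const ranges = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    const inClass = sorted.filter(
      (v) => (i === 0 ? v >= bounds[i] : v > bounds[i]) && v <= bounds[i + 1]
    );
    ranges.push(
      inClass.length ? [inClass[0], inClass[inClass.length - 1]] : null
    );
  }
  return ranges;
}

// Made-up sample data: the gaps (3 to 10, 12 to 20) show up in the result.
console.log(actualRanges([1, 2, 3, 10, 12, 20, 25], [1, 3, 12, 25]));
```

The gaps between one class's maximum and the next class's minimum are exactly what makes this legend discontinuous.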

The currently used method in geostats is not even described in the book, although it is also used elsewhere (e.g. QGIS, like you mentioned).

IMHO it'd be best to have many different ways of showing the limits. They all have their strengths and weaknesses.

simogeo commented 11 years ago

> The currently used method in geostats is not even described in the book, although it is also used elsewhere (e.g. QGIS, like you mentioned).

Actually, this "standard usage" in most GIS software comforted me in my laziness when writing geostats.js. I have to admit that at first I was thinking of implementing method 2) described by Terry A. Slocum et al.

If you want to contribute by implementing method 1), method 2) or both in the geostats lib, you're most welcome. My only recommendation is to make it available as an option.

Please let me know if you want to contribute (and thanks for your thoughts).

tmiosmauli commented 11 years ago

The third method shows the upper limit of each class, meaning that the values are not placed next to the colored boxes as in a traditional legend, but rather "in between" the classes. For a four-class map there are five numbers: min, max, and the upper limits of the 1st, 2nd and 3rd classes.
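The layout of this third method can be sketched as an alternating list of limits and color boxes. `betweenClassLegend` is a hypothetical helper, not a geostats function, and the bounds and color names in the usage line are made up for illustration.

```javascript
// Hypothetical helper (not part of geostats): build legend rows where the
// limits sit between the class colors, so n classes yield n + 1 numbers.
function betweenClassLegend(bounds, colors) {
  if (bounds.length !== colors.length + 1) {
    throw new Error("expected one more bound than colors");
  }
  const rows = [String(bounds[0])]; // minimum of the data
  colors.forEach((color, i) => {
    rows.push("[" + color + "]"); // placeholder for the colored box
    rows.push(String(bounds[i + 1])); // upper limit of class i (max for the last)
  });
  return rows;
}

// Made-up four-class example: five numbers, four boxes.
console.log(betweenClassLegend([0, 10, 20, 30, 40], ["c1", "c2", "c3", "c4"]).join("\n"));
```

Because each number appears only once, the question of which class a border value belongs to is not answered by the legend itself and would still need a stated convention.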

I'm sorry but my JS skills are rather limited. The resulting code is far from pretty :)

And yes, I agree that methods 1 and 2 should be optional (in addition to the current way of doing things). The third one... well, if someone wants to do it.

simogeo commented 11 years ago

Method 2 has been implemented, see commit 1c9aafc4cf061261918fdee2c69a9c21ff47ea9f

Any feedback welcome

tmiosmauli commented 11 years ago

This looks very nice, thank you!

I have one question: when calculating the start of the next class, should it be checked whether the next value has the same number of decimals? In other words, would the class limit values [0.5, 1, 1.5, 1.51] generate a legend like the one below?

 0.5 - 1
 2 - 1.5
 1.6 - 1.51

This might be a stupid question and later I could just test it, but now I don't have the time.

simogeo commented 11 years ago

Of course not. And that was a big point to deal with.

Since the last commit, decimal precision is taken into account.

When a series is given to geostats, the lib now automatically determines the decimal precision from the maximum decimal precision found in the series.

For example, the series [0.152, 2, 2.3, 4.52, 5, 8.1] will be converted to [0.152, 2.000, 2.300, 4.520, 5.000, 8.100].
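One way such a detection could work is sketched below. This is not the code from the commit: `decimalCount` and `normalizeSeries` are hypothetical names, and the sketch ignores numbers that JavaScript prints in exponent notation (such as 1e-7).

```javascript
// Hypothetical helpers (not the geostats implementation).

// Number of digits after the decimal point in the default string form.
function decimalCount(n) {
  const text = String(n);
  const dot = text.indexOf(".");
  return dot === -1 ? 0 : text.length - dot - 1;
}

// Format every value with the largest decimal count found in the series.
function normalizeSeries(series) {
  const precision = Math.max(...series.map(decimalCount));
  return series.map((v) => v.toFixed(precision));
}

console.log(normalizeSeries([0.152, 2, 2.3, 4.52, 5, 8.1]));
```

The values come back as strings because a JavaScript number cannot carry trailing zeros; 2.000 and 2 are the same number, so the padding only exists in the formatted output.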

This can be overridden by the developer using the setPrecision() method. For example, setPrecision(0) will convert all numbers to integers and setPrecision(2) will convert the series to 2 decimals.

For your example, the displayed legend will be:

  0.5 - 1.0
  1.1 - 1.5
  1.6 - 2.4

And setting precision to 3 will display :

  0.500 - 1.000
  1.001 - 1.500
  1.501 - 2.400

Note: the setPrecision() method affects not only the legend but also the classification.

simogeo commented 11 years ago

The discontinuous legend (method 1) is implemented as well. See 80b4f645e20a5e06efad87e4f4a20d744cbb440c

thanks.