r-spatial / classInt

Choose Univariate Class Intervals
https://r-spatial.github.io/classInt/

address #44 #45

Closed rsbivand closed 1 year ago

rsbivand commented 1 year ago

@duccioa This is my attempt to sort out the logic, but I'm unsure whether QGIS always uses 3000 when N > 3000, or how the sampling proportion is applied. I'll carry on looking at the QGIS documentation. I'd be grateful for your comments.

duccioa commented 1 year ago

@rsbivand to me the implementation looks good. The lines

if (warnLargeN &&
    (style %in% c("kmeans", "hclust", "bclust", "fisher", "jenks"))) {
  if (nobs > largeN) {
    warning("N is large, and some styles will run very slowly; sampling imposed")
    sampling <- TRUE
    # issue 44
    nsamp <- as.integer(ceiling(samp_prop*nobs))
    if (nsamp > largeN) nsamp <- as.integer(largeN)
  }
}

seem to imply that sampling is used in algorithms other than fisher and jenks, but I was not able to find any trace of it, because nsamp is used only in the fisher and jenks implementations. Am I wrong?
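For reference, the sample-size arithmetic in the quoted lines can be tried in isolation. This is only a sketch: `sample_size` is a hypothetical helper, not a classInt function, and returning `nobs` unchanged when no sampling is triggered is an assumption made here so the helper always returns a value.

```r
# Hypothetical helper (not part of classInt): mirrors the nsamp arithmetic
# from the quoted lines.
sample_size <- function(nobs, samp_prop, largeN) {
  # Assumption: at or below the threshold, every observation is used.
  if (nobs <= largeN) return(as.integer(nobs))
  # Take a proportion of the observations, rounded up ...
  nsamp <- as.integer(ceiling(samp_prop * nobs))
  # ... but never more than largeN of them.
  if (nsamp > largeN) nsamp <- as.integer(largeN)
  nsamp
}

sample_size(2000, 0.25, 3000L)   # 2000: below the threshold
sample_size(4000, 0.25, 3000L)   # 1000: a quarter of 4000
sample_size(20000, 0.25, 3000L)  # 3000: a quarter would be 5000, capped at largeN
```

So with these inputs the sample is capped at `largeN` only when `samp_prop * nobs` exceeds it; otherwise the proportion decides.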

rsbivand commented 1 year ago

@duccioa You are quite right (again). I'll try to introduce nsamp to those styles too.

rsbivand commented 1 year ago

@duccioa The help page was right, not the code. It is not obvious how to find breaks from a sample in the three other styles (kmeans, hclust, bclust), so I just removed those styles from the warning.
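A rough sketch of what the narrowed condition might look like, assuming the change amounts to listing only the two styles that actually use nsamp. `uses_sampling` is a hypothetical name for illustration; the committed code may be structured differently.

```r
# Hypothetical predicate (not classInt code): TRUE when the large-N warning
# and sampling would apply, assuming only fisher and jenks remain listed.
uses_sampling <- function(style, nobs, largeN, warnLargeN = TRUE) {
  warnLargeN && style %in% c("fisher", "jenks") && nobs > largeN
}

uses_sampling("fisher", 5000, 3000L)  # TRUE
uses_sampling("kmeans", 5000, 3000L)  # FALSE: no longer in the list
uses_sampling("jenks", 2500, 3000L)   # FALSE: below the threshold
```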