r-spatial / classInt

Choose Univariate Class Intervals
https://r-spatial.github.io/classInt/

Define new automatic classes when the number of rows in dataframe in loop is 1 #48

Closed ar-onos closed 2 months ago

ar-onos commented 2 months ago

Hi there. I have a list of several dataframes, some of which have only one row. Ideally, I want 'n' in classIntervals to be 5. Is there a way to force particular class labels, e.g. 3 if there is only one row, or 1, 3, and 5 if there are only three rows? I use fisher as the method.
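
To make the request concrete, here is a minimal base-R sketch of such a fallback. `fallback_classes` is a hypothetical helper, not part of classInt. The rule it applies (spread the observations evenly over the labels 1..n by rank, with a single observation taking the middle label) is an assumption read off the examples in the question.

```r
# Hypothetical fallback for groups too small for classIntervals().
# Spreads the k observations over the labels 1..n by rank.
fallback_classes <- function(x, n = 5) {
  k <- length(x)
  stopifnot(k >= 1, k <= n)
  if (k == 1) {
    # Single observation: assumed to take the middle label
    return(as.integer(ceiling(n / 2)))
  }
  # Evenly spaced labels between 1 and n, e.g. 1, 3, 5 for k = 3 and n = 5
  labels <- as.integer(round(seq(1, n, length.out = k)))
  # Assign labels in order of the values (ties share the lower label)
  labels[rank(x, ties.method = "min")]
}

fallback_classes(21)           # 3
fallback_classes(c(10, 2, 7))  # 5 1 3
```

Labels produced this way are ordinal placeholders only. They are not Fisher breaks and say nothing about the distribution of the values.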

rsbivand commented 2 months ago

classInt::classIntervals is not automatic, so the user must take responsibility for setting arguments sensibly. Using the function on too short vectors is meaningless anyway. You have not provided a minimal reproducible example to motivate your question. Provide a more complete context and example.

ar-onos commented 2 months ago

Hi @rsbivand, I have to run this loop over a list of 1000 dataframes. I need to be able to use it for dataframes that have only 1 or 2 rows in the loop. Here is a reproducible example:

  library(classInt)
  library(rlist)
  library(dplyr)

  ## Create dataframe
  Country <- c('Australia', 'Italy', 'Peru', 'China', 'Australia', 'Italy', 'Peru',
               'China', 'Australia', 'Italy', 'Peru', 'China', 'Nigeria', 'Australia',
               'Italy', 'Peru', 'China', 'Indonesia', 'Indonesia')
  Time <- c(21, 18, 17, 10, 10, 15, 27, 0, 2, 4, 5, 7, 4, 8, 9, 10, 5, 2, 4)
  Area <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "D", "D", "D", "D", "D", "C", "B")
  DF <- data.frame(Country, Time, Area)

This should produce this dataframe:

       Country Time Area
  1  Australia   21    A
  2      Italy   18    A
  3       Peru   17    A
  4      China   10    A
  5  Australia   10    B
  6      Italy   15    B
  7       Peru   27    B
  8      China    0    B
  9  Australia    2    C
  10     Italy    4    C
  11      Peru    5    C
  12     China    7    C
  13   Nigeria    4    D
  14 Australia    8    D
  15     Italy    9    D
  16      Peru   10    D
  17     China    5    D
  18 Indonesia    2    C
  19 Indonesia    4    D

  ## Split by Country
  NewXL <- split(DF, DF$Country)

  ## Generate the ranges and category/classes for each country
  NewXL2 <- list()
  for (i in 1:length(NewXL)) {
    AB <- NewXL[[i]]
    # Create condition:
    skip_to_next <- FALSE
    tryCatch(
      Classes <- classIntervals(AB$Time, n = 3, cutlabels = FALSE, style = 'fisher',
                                factor = FALSE, warnSmallN = FALSE, warnLargeN = FALSE),
      error = function(e) {
        skip_to_next <<- TRUE
      }
    )
    if (skip_to_next) { next }
    ## Classify
    # Range and class for each absolute population exposed
    AB$Range_Abs <- classify_intervals(AB$Time, 3, "fisher", factor = TRUE)
    AB$Class_Abs <- classify_intervals(AB$Time, 3, "fisher", factor = FALSE)

    NewXL2[[i]] <- AB
  }

rsbivand commented 2 months ago

"I have to" is not an argued motivation. The function is not designed for this case, so your motivation must be fully explained. Time is not in a time class, and assigning to the global environment with <<- should never be used. You must be able to branch on the number of rows in a data.frame for degenerate cases. This was in any case not a reproducible example.

ar-onos commented 2 months ago

> "I have to" is not an argued motivation. The function is not designed for this case, so your motivation must be fully explained. Time is not in a time class, and assigning to the global environment with <<- should never be used. You must be able to branch on the number of rows in a data.frame for degenerate cases. This was in any case not a reproducible example.

Apologies for the "I have to"; it wasn't meant as a motivation. Time is just a column name, not time in the real sense. It is simply a numeric variable and could just as well be population. The loop ran without issues until recently; it simply skipped the problematic single-row dataframes. Thanks for responding. Someone provided a workaround on Stack Overflow that I'm exploring and modifying too.
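
For readers following along, the branching suggested earlier in the thread (deciding up front which groups are degenerate, rather than catching errors and setting a flag with `<<-`) might look roughly like the sketch below. `classify_group` is a hypothetical helper. The cut-off (a group needs more distinct values than classes) and the `NA` fallback are assumptions, not classInt behaviour or the maintainer's code. The small `DF` stands in for the one built above.

```r
library(classInt)

# Small stand-in for the DF built earlier in the thread
DF <- data.frame(
  Country = c("Australia", "Australia", "Australia", "Australia",
              "Indonesia", "Indonesia", "Nigeria"),
  Time = c(21, 10, 2, 8, 2, 4, 4)
)

# Hypothetical helper: classify one group, branching on the number of
# distinct values instead of catching errors and flagging them with `<<-`.
classify_group <- function(df, n = 3, style = "fisher") {
  n_unique <- length(unique(df$Time))
  if (n_unique <= n) {
    # Degenerate group (assumed cut-off): keep the rows, leave the class missing
    df$Class_Abs <- NA_integer_
  } else {
    df$Class_Abs <- classify_intervals(df$Time, n = n, style = style,
                                       factor = FALSE)
  }
  df
}

# One classified data frame per country, named by country
NewXL2 <- lapply(split(DF, DF$Country), classify_group)
```

Because `lapply` returns one element per group, no entries are silently dropped. Groups that were too small can be found afterwards by checking for `NA` in `Class_Abs`.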

rsbivand commented 2 months ago

Add the Stack Overflow link for completeness. I personally dislike answering questions where the questioner doesn't wait for a reply but goes to untrusted sources.

ar-onos commented 2 months ago

I actually posted on both at around the same time. Apologies for assuming the issue would be clear without giving more detail about what it was. I am adding this link for anyone who runs into the same problem in the future: https://stackoverflow.com/questions/78703839/adding-conditions-to-loop-to-generate-class-intervals-using-classint-in-r-for-ar