hyunsooseol / snowCluster

This module allows users to analyze k-means & hierarchical clustering, and visualize results of Principal Component, Correspondence Analysis, Discriminant analysis, Decision tree, Multidimensional scaling, Multiple Factor Analysis, Machine learning, and Prophet analysis.
http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization

LDA in SnowCluster not showing the results #13

Closed AireinAlbania closed 8 months ago

AireinAlbania commented 1 year ago

Hi!

I am using snowCluster for linear discriminant analysis, but the results are not loading. I am using macOS and have installed the latest version of jamovi.

hyunsooseol commented 1 year ago

Hi AireinAlbania

Have you tried the example file (Multivariate analysis > iris2)? If it does not load, could you attach the .omv file as a zip file or send it to my e-mail, snow@cau.ac.kr, so I can see what happens? Best, Seol

AireinAlbania commented 1 year ago

Yes, I did use the iris dataset but the results are also not showing. Attached is the dataset that we are currently using in class. Some of my classmates who are using MacOS have the same concern. Only one of us was able to run the analysis.

Thank you for your immediate response.

StudentsPerformance.csv

hyunsooseol commented 1 year ago

Hi AireinAlbania

It works on Windows 10; the .omv file is attached here as a zip: StudentsPerformance (1).zip. Hmm... I will ask Jonathon about it.

AireinAlbania commented 1 year ago

Thank you so much!

Looking forward to having this concern addressed.

hyunsooseol commented 1 year ago

Hi AireinAlbania

Jonathon will take a look tomorrow. Anyway, could you tell me what happens on Windows?

AireinAlbania commented 1 year ago

Unfortunately, I don't have any equipment that runs Windows. But I was able to open the file you sent and saw the results. I ran the analysis again using that file but, again, the results are not loading.

hyunsooseol commented 1 year ago

Hi AireinAlbania

Are you using the latest version of the snowCluster module from the jamovi library? The latest version is 6.8.0. Please check.

Best Regards Seol

AireinAlbania commented 1 year ago

Yes, the version I am using is updated.

Thanks!

hyunsooseol commented 1 year ago

Hi AireinAlbania

Is it working or not with the latest version of the snowCluster module?

AireinAlbania commented 1 year ago

Hello Seol,

Yes, the results are still not loading with the latest version. I am not sure if it happens only on macOS, since one of my classmates who uses macOS was able to run it successfully.

Thanks!

hyunsooseol commented 1 year ago

Hi AireinAlbania

In that case, the problem might be due to the specific environment of your Mac, which is beyond my ability to fix.

Best Regards Seol

AireinAlbania commented 1 year ago

Thank you very much for accommodating my questions. I will try to find out why this happened.

vjalby commented 10 months ago

Hi,

I have the same problem using LDA on my mac, with the Iris sample files provided with your plugin. More precisely, it works with 2 covariates but fails (spinning arrows) when I use 3 or more.

I downloaded your module source and built it on my Mac, with the same problem. After hours (!) of debugging, the problem seems to be related to the .plot2 part of disc.b.R. Actually, it is not the function itself (the problem remains if I add return(FALSE) as the first line of the function) but the point where you set the image state at the end of the .run() function (lines 328-330):

      state <- list(formula, train) 
      image2$setState(state)

I cannot figure out why it makes the .run() function fail (spinning arrow, no error/debug message), and only with more than 2 variables... Maybe state cannot be a list?

My workaround was to replace this with anything else, e.g.,

image2$setState(df)

then to modify the .plot2() function to rebuild the data from scratch:

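      # Rebuild the training data inside .plot2() instead of reading it from state
      data <- self$data
      data <- jmvcore::naOmit(data)
      per <- self$options$per
      formula <- jmvcore::constructFormula(self$options$dep, self$options$covs)
      formula <- as.formula(formula)
      # createDataPartition() samples at random, so this split can differ from the one in .run()
      split1 <- caret::createDataPartition(data[[self$options$dep]], p = per, list = FALSE)
      train <- data[split1, ]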

instead of using

      formula <- image2$state[[1]]
      train <- image2$state[[2]]

Maybe my workaround is not the best, but it makes the iris sample work on mac. I hope this will help to fix your module so it can work on mac too :) Feel free to contact me if you need a mac tester !

Not related to the previous problem, but it seems your snowCluster LDA requires a dependent variable with at least 3 levels (it doesn't work with a dichotomous dependent variable), while the MASS::lda function does not (it works with a dichotomous dependent variable).

Regards,

Vincent

hyunsooseol commented 10 months ago

Hi vjalby

Are you using the latest versions of jamovi and the snowCluster module? Please check that first, or try another Mac to see what happens.

LDA is working well on WIN 10 now.

Best Seol

vjalby commented 10 months ago

i'm using Jamovi 2.4.8 (latest available for mac) and snowCluster 7.2.5. Same problem on my laptop (M2 Macbook)

hyunsooseol commented 10 months ago

Hi vjalby

LDA is working well on Windows 10, as shown below. Please report the problem directly on the jamovi forum. Jonathon Love is responsible for pushing modules into the jamovi library.

image

Best, Seol

vjalby commented 10 months ago

done : https://forum.jamovi.org/viewtopic.php?t=3646

jonathon-love commented 10 months ago

hi @vjalby,

thanks for your efforts here! this is some great investigating you've done. i think the issue might be that the state object is too large.

see here:

https://dev.jamovi.org/tuts0203-state.html#setstate()

cheers

hyunsooseol commented 10 months ago

Hi @jonathon-love

Could you check whether the LDA analysis in the snowCluster module works on Mac with the iris example data? I cannot access a Mac right now.

Best Regards Seol

jonathon-love commented 10 months ago

yes, i can confirm that two works, but three or more doesn't -- consistent with the hypothesis that the state object is too large.

hyunsooseol commented 10 months ago

@jonathon-love But it works on Windows 10 with more than 3 covariates, as shown below. I am just wondering why it does not work on Mac with more than 3 covariates.

image

jonathon-love commented 10 months ago

my hypothesis is that the size of the state objects differ between macOS and windows -- of course, why it differs i don't know. that might be something deep in the bowels of R.

of course, this is all just a hypothesis at this point ... we need to confirm that the size of those state objects differ ... but i'm reasonably confident that's it.

basically, storing non-basic R objects in state is very problematic, because serialization in R often takes up huge amounts of space.
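To illustrate the point about non-basic objects: one well-known way a small-looking R object can serialize to something large is a formula that carries its enclosing environment with it. The sketch below shows that general R behaviour only. It is not a confirmed diagnosis of the snowCluster state, and `make_formula` and `big` are hypothetical names made up for the example.

```r
# Hypothetical helper: builds a formula inside a function that also holds a
# large local object. as.formula() attaches the calling frame as the
# formula's environment.
make_formula <- function() {
  big <- rnorm(1e5)  # roughly 800 KB of doubles living in this frame
  as.formula("Species ~ Sepal.Length + Sepal.Width")
}

f <- make_formula()

# Serializing the formula also serializes its environment, including `big`.
size_with_env <- length(serialize(f, connection = NULL))

# Pointing the formula at the global environment drops the captured frame.
environment(f) <- globalenv()
size_without_env <- length(serialize(f, connection = NULL))

cat("with captured environment:", size_with_env, "bytes\n")
cat("with global environment:  ", size_without_env, "bytes\n")
```

If something like this were happening in a module, storing plain values (character strings, numeric vectors, data frames) in state instead of formulas or model objects would avoid it.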

hyunsooseol commented 10 months ago

@jonathon-love Thanks for your kind reply. What should I do to fix it on Mac? This is the first time I have encountered this kind of problem caused by a difference between Windows and Mac.

Best Regards Seol

jonathon-love commented 10 months ago

let's wait for @vjalby to confirm the size of the state object. are you able to do that for us @vjalby?

but basically @vjalby's solution, to build the train object in the plot function is (probably) the way to handle this.

one thing i'd add, is that if you are accessing self$data from within the plot function, you'll need to add a requiresData: true into the .r.yaml

https://github.com/jamovi/jmv/blob/master/jamovi/corrmatrix.r.yaml#L168-L169

basically, we try and provide summary data so that plots can be generated without having to load the whole data set (think about the scenario where someone wants to save an image ... we want to regenerate that image without rerunning the whole analysis) ... that's what $setState() lets us do.

but sometimes that summary data is so large, we can't store it in the state system, and the analysis can only generate the plot by loading the whole data set ... which i think is your situation.
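As a rough sketch of the "summary data" idea, the state could hold only what the plot actually draws, for example the discriminant scores and class labels, rather than the formula and the training data. This is an assumption about one possible design, not the change that was made in snowCluster. `plot_state` is a hypothetical name, and the example assumes the MASS package is installed.

```r
# Fit LDA on the iris data (assumes MASS is installed).
fit <- MASS::lda(Species ~ ., data = iris)

# Keep only what an LD plot needs: discriminant scores plus the class label.
scores <- predict(fit, newdata = iris)$x
plot_state <- data.frame(scores, class = iris$Species)

# Check how large this plain data frame is once serialized.
state_bytes <- length(serialize(plot_state, connection = NULL))
cat("columns:", paste(names(plot_state), collapse = ", "), "\n")
cat("serialized size:", state_bytes, "bytes\n")
```

A data frame like `plot_state` contains no environments or model objects, so its serialized size stays close to the size of the numbers it holds.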

hyunsooseol commented 10 months ago

@jonathon-love Thanks for your kind solutions. I am waiting for @vjalby's reply and will try to fix it. Thanks always~ ^^

hyunsooseol commented 10 months ago

Hi @vjalby and @jonathon-love, I have resolved the 2 issues raised above and uploaded the fix to GitHub. Could you check whether it works on Mac? @jonathon-love, if it's OK, could you push it to the jamovi library?

Best Regards Seol

vjalby commented 10 months ago

Hi @jonathon-love and @hyunsooseol

Thanks for your quick replies !

According to my tests with length(serialize(state, connection=NULL)), the size of the state is 3,107,600 bytes with 2 covariates (analysis works), 3,236,849 bytes with 3 covariates (analysis fails), and 3,364,534 bytes with 4 covariates (fails too). Within R, the iris df is only 5,798 bytes!
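The same measurement can be broken down per element to see which part of a state list dominates. The sketch below uses a stand-in list built from iris, not the real snowCluster state, and `serialized_size` is a hypothetical helper name.

```r
# Hypothetical helper: serialized size of any R object, in bytes.
serialized_size <- function(x) length(serialize(x, connection = NULL))

# Stand-in for a state list; the one in the thread was list(formula, train).
state <- list(
  formula = as.formula("Species ~ Sepal.Length + Sepal.Width + Petal.Length"),
  train   = iris
)

# Size of the whole list, then of each element separately.
total_bytes   <- serialized_size(state)
element_bytes <- vapply(state, serialized_size, integer(1))

print(total_bytes)
print(element_bytes)
```

Running this inside the module's own .run() function, on the real state object, would show whether the formula or the training data accounts for the megabytes reported above.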

hyunsooseol commented 10 months ago

Hi @vjalby I uploaded the fixed codes to github. Could you test it with MAC now?

vjalby commented 10 months ago

@hyunsooseol i tested your new version and it fixes the bug on mac.

two more comments :

hyunsooseol commented 10 months ago

@vjalby Thank you very much for your kind comments.

Regarding a dependent variable with 2 levels: when the dependent variable has only 2 levels, MASS::lda() does not provide the 'Proportion of trace' or 'LD plot' results, but everything else works well. That's why I created warnings for those two options.

As for setting the split factor to 1, I will change the default value to 1 as you suggested.
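The two-level case can be checked directly in R. With g classes, LDA yields at most g - 1 discriminant functions, so a dichotomous dependent variable gives a single LD1, and there is no second axis for a two-dimensional LD plot. The example below is a minimal sketch using a two-species subset of iris; `two_class` and `fit2` are names chosen for the example, and it assumes the MASS package is installed.

```r
# Two-class subset of iris (droplevels removes the unused "setosa" level).
two_class <- droplevels(subset(iris, Species != "setosa"))

fit2 <- MASS::lda(Species ~ ., data = two_class)

# Number of discriminant functions: at most (number of classes - 1).
n_discriminants <- ncol(fit2$scaling)
print(n_discriminants)

# The fitted object still prints priors, group means, and LD1 coefficients.
print(fit2)
```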

If you have any other fixes or improvements, please let me know. Thanks again.

Best Regards Seol

MAgojam commented 10 months ago

Hey, two lines just to add that on Ubuntu too (as on macOS), it works with two covariates but fails when a third is included. Technically, serialization is simply one step in marshalling an object, which completes on Windows but not on the other two. The sizes of the objects do not differ much from each other; if anything, on Windows they are slightly larger. So, even though a possible fix has been found for now, the underlying problem remains open.

Cheers, Maurizio

hyunsooseol commented 10 months ago

@MAgojam Thanks for your kind comments.

If you have time, could you test it with more than 3 variables on Mac using the following updated GitHub source: https://github.com/hyunsooseol/snowCluster.git

I hope it works with more than 3 covs. on MAC. Thanks in advance Seol

hyunsooseol commented 10 months ago

@vjalby Thanks for reporting the bug on Mac. The updated version is now available in the jamovi library.

image

Best Regards Seol