catterbu closed 8 years ago
@catterbu: how are those unit tests for boolean data working out? let me know when this seems fully tested and ready to merge.
@nsh87 Yeah, pretty much done. Had a weird last couple of days, but I think that I have everything figured out. I just have to dirty that data set I created, integrate the changes you just added via rebase, and then make sure that everything is good.
@catterbu: this isn't correct anymore, is it? It should say that it's chosen by majority rule using a bunch of different algorithms, right?
@nsh87 the documentation for the multiClust object appears to have updated properly. Let me know if I missed something. Chris
On Mar 14, 2016, at 9:03 AM, Nikhil Haas notifications@github.com wrote:
@catterbu: this isn't correct anymore, is it? It should say that it's chosen by majority rule using a bunch of different algorithms, right?
@catterbu: Not sure what this is referring to: "@nsh87 the documentation for the multiClust object appears to have updated properly. Let me know if I missed something."
@nsh87 you had a comment about the docs needing to be updated. Maybe I misinterpreted it on my phone.
On Mar 14, 2016, at 9:42 AM, Nikhil Haas notifications@github.com wrote:
@catterbu: Not sure what this is referring to: "@nsh87 the documentation for the multiClust object appears to have updated properly. Let me know if I missed something."
@catterbu: you had documented multiClust in two different functions (basically copying the documentation in one to the other). You updated one of them, but not the other. So rather than deal with this again when COMMUNAL is used, I'm just creating an actual class where the documentation for multiClust can live and it can just be referenced from whatever functions you want. I'll have to update some of the code since referencing slots in a class is done with @, not the typical indexing with [ ].
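For readers following the slot-versus-index point, here is a minimal sketch in R. The class name multiClustSketch and its two slots are hypothetical stand-ins, not the real multiClust definition from this pull request; only the slot name k_best is taken from the pull request description.

```r
library(methods)  # S4 support; not always attached when running under Rscript

# Hypothetical, trimmed-down stand-in for the multiClust class.
setClass("multiClustSketch",
         representation(k_best = "numeric", clusters = "list"))

# The same data held as a plain list and as an S4 object.
as_list <- list(k_best = 3, clusters = list(a = 1:3))
as_s4 <- new("multiClustSketch", k_best = 3, clusters = list(a = 1:3))

k_from_list <- as_list[["k_best"]]    # list-style indexing
k_from_s4 <- as_s4@k_best             # slot access with @
k_from_slot <- slot(as_s4, "k_best")  # functional form of the same access

# List-style indexing is not defined for this class, so the call signals an
# error; tryCatch captures the error message instead of stopping the script.
index_attempt <- tryCatch(as_s4[["k_best"]],
                          error = function(e) conditionMessage(e))
```

Any code that previously read fields with [ ] or [[ ]] would need the @ form (or slot()) once the object becomes an instance of a formal class.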
@nsh87 got it.
On Mar 14, 2016, at 9:57 AM, Nikhil Haas notifications@github.com wrote:
@catterbu: you had documented multiClust in two different functions (basically copying the documentation in one to the other). You updated one of them, but not the other. So rather than deal with this again when COMMUNAL is used, I'm just creating an actual class where the documentation for multiClust can live and it can just be referenced from whatever functions you want. I'll have to update some of the code since referencing slots in a class is done with @, not the typical indexing with [ ].
@catterbu: maybe I missed it, but did you generate a new multiClust object f_clust.rda for use in the unit tests, or is it still the old one?
@nsh87 No! I forgot to look up how to do that.
On Mar 14, 2016, at 10:43 AM, Nikhil Haas notifications@github.com wrote:
@catterbu: maybe I missed it, but did you generate a new multiClust object f_clust.rda for use in the unit tests?
@catterbu: it's just save(object_to_save, file="filename.rda"), then move it where you want to.
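A short round-trip sketch of that save() call, assuming a placeholder object in place of the real clustering result and a temporary directory in place of the package's test data folder:

```r
# Hypothetical stand-in for the clustering result; any R object works.
f_clust <- list(k_best = 3)

# Write to a temporary location; the real fixture would live wherever the
# unit tests expect it.
rda_path <- file.path(tempdir(), "f_clust.rda")
save(f_clust, file = rda_path)

# Remove the object, then restore it from disk.
rm(f_clust)
restored <- load(rda_path)  # load() returns the names of the restored objects
```

Note that save() stores the object under its variable name, so load() recreates a variable called f_clust regardless of what the file is named.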
This pull request adds a new method for determining the best number of clusters for multi_clust to use for a given data set. Specifically:
- validate_num_data now returns a boolean indicating whether the data set passed to it is boolean. Previously, it only printed a message to the screen when this was the case.
- The NbClust function was brought in from the NbClust package, under the GNU General Public License, and adapted to be more robust in cases where not all algorithms can be used.
- multi_clust calls the NbClust function to determine the ideal number of clusters for a data set, stored as k_best in the multiClust object.
- multi_clust uses the new value returned by validate_num_data to call NbClust differently when a binary data set is being used.
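The two ideas in this description, flagging binary data and picking k by majority rule across several algorithms, can be sketched roughly as follows. Both helpers (is_binary_data and majority_k) are hypothetical illustrations written for this note; they are not the code in validate_num_data or the adapted NbClust function, and the tie-breaking rule is an assumption.

```r
# Hypothetical helper: TRUE when every non-missing value is 0 or 1.
is_binary_data <- function(x) {
  vals <- unlist(x, use.names = FALSE)
  all(vals[!is.na(vals)] %in% c(0, 1))
}

# Hypothetical helper: majority rule over the k proposed by each algorithm.
# Algorithms that could not be run are passed as NA and ignored; ties go to
# the smallest k (an assumption made for this sketch).
majority_k <- function(k_votes) {
  k_votes <- k_votes[!is.na(k_votes)]
  if (length(k_votes) == 0) stop("no algorithm produced a usable k")
  counts <- table(k_votes)
  as.integer(names(counts)[which.max(counts)])
}

binary_flag <- is_binary_data(data.frame(a = c(0, 1, 1), b = c(1, 0, NA)))
numeric_flag <- is_binary_data(matrix(c(0.5, 1, 2, 0), nrow = 2))
k_best <- majority_k(c(3, 2, 3, NA, 4, 3))
```

Dropping the NA votes is one simple way to stay robust when some algorithms cannot be applied, for example to a binary data set.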