kwikteam / klustakwik2

Fast software for high-dimensional cluster analysis using the masked EM algorithm for Gaussian mixtures
BSD 3-Clause "New" or "Revised" License

different input and output filenames #57

Closed nsteinme closed 9 years ago

nsteinme commented 9 years ago

Currently, telling the algorithm to use name.fet.1 and name.fmask.1 produces name.clu.1 and name.klg.1. Instead, it should be possible to specify that the algorithm reads the data in name.fet.1 and name.fmask.1 but writes name2.clu.1 and name2.klg.1. This would avoid the need to duplicate either the fet/fmask or the clu/klg files when running a second time on a given dataset.
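As a rough illustration of the request, here is a minimal sketch of the file-name mapping it implies. The helper `kk_paths` and its `out_base` argument are hypothetical names made up for this example; they are not part of klustakwik2.

```python
from pathlib import Path


def kk_paths(in_base, shank, out_base=None):
    """Hypothetical helper: map base names to KlustaKwik-style file paths.

    Inputs (fet/fmask) are read from `in_base`; outputs (clu/klg) are
    written under `out_base`, which falls back to `in_base` so that the
    current behaviour is kept when no separate output name is given.
    """
    if out_base is None:
        out_base = in_base
    return {
        "fet": Path(f"{in_base}.fet.{shank}"),
        "fmask": Path(f"{in_base}.fmask.{shank}"),
        "clu": Path(f"{out_base}.clu.{shank}"),
        "klg": Path(f"{out_base}.klg.{shank}"),
    }


paths = kk_paths("name", 1, out_base="name2")
print(paths["fet"])  # name.fet.1
print(paths["clu"])  # name2.clu.1
```

With this kind of mapping, a second run only needs a new output base name, and the fet/fmask files stay where they are.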

thesamovar commented 9 years ago

This isn't too much effort to implement, but I think we'll probably soon switch to primarily using phy on kwik files rather than fet/fmask/clu files. @rossant @nippoo ?

rossant commented 9 years ago

@nsteinme how would you want it to work exactly with phy?

nippoo commented 9 years ago

Yep, this is definitely an issue for phy! Thankfully the KWIK format already supports multiple clusterings within the same file, so this could just be done from the command-line script:

phy cluster-auto --clustering-name=kk_test_1 myexperiment.kwik

or whatever you want to call it.

In fact, I think the only change we need to make is to save the KlustaKwik parameters with the clustering in a notionally read-only way, so you can just create loads of clusterings, each with a numerical UUID, and distinguish between them using the parameters. For example, you could imagine a dataset with 100 or so different clusterings, covering 3-4 different values of half a dozen different parameters; having loads of .clu files would just get really confusing.
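To make the idea concrete, here is a minimal in-memory sketch of clusterings that are identified by a UUID and looked up by their parameters. It does not touch a KWIK file and is not phy's API; `ClusteringStore`, its methods, and the parameter names used below are all assumptions invented for illustration.

```python
import uuid
from types import MappingProxyType


class ClusteringStore:
    """Hypothetical store: each clustering keeps a read-only copy of its parameters."""

    def __init__(self):
        self._entries = {}

    def add(self, params, labels):
        # Assign a fresh UUID rather than asking the user for a name.
        cid = uuid.uuid4().hex
        # MappingProxyType gives a read-only view of the copied parameters.
        frozen = MappingProxyType(dict(params))
        self._entries[cid] = (frozen, list(labels))
        return cid

    def params(self, cid):
        return self._entries[cid][0]

    def find(self, **wanted):
        # Return the ids of all clusterings whose parameters match `wanted`.
        return [
            cid
            for cid, (p, _) in self._entries.items()
            if all(p.get(k) == v for k, v in wanted.items())
        ]


store = ClusteringStore()
# The parameter names here are illustrative placeholders only.
first = store.add({"n_start": 100, "penalty": 1.0}, labels=[0, 1, 1, 0])
second = store.add({"n_start": 500, "penalty": 1.0}, labels=[0, 2, 1, 0])
matches = store.find(n_start=500)
```

Here `matches` contains only the id of the second clustering, and trying to assign to `store.params(first)["n_start"]` raises a `TypeError`, which is one way to get the "notionally read-only" behaviour.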

thesamovar commented 9 years ago

@nsteinme would you be happier using fet/fmask/clu or would @nippoo's suggestion work for you?

nippoo commented 9 years ago

Or a similar argument to the clustering.run() function, with default options run(algorithm='klustakwik2', clustering='main').

nsteinme commented 9 years ago

I'm definitely on board with the kwik version of this, and the way that Max described it using the multiple clusterings is the way I was thinking of it. If we're ready to go with that in the short term, then great, we can forget this issue and forget the fet/fmask.

On Wed, Jun 10, 2015 at 2:40 PM, Dan Goodman wrote:

> @nsteinme would you be happier using fet/fmask/clu or would @nippoo's suggestion work for you?

thesamovar commented 9 years ago

OK, sounds like a plan. Closing this issue.

rossant commented 9 years ago

see https://github.com/kwikteam/phy/issues/333