JuliaDynamics / ComplexityMeasures.jl

Estimators for probabilities, entropies, and other complexity measures derived from data in the context of nonlinear dynamics and complex systems
MIT License

"Pick one word per concept" is violated in "symbolizing" stuff #140

Closed Datseris closed 1 year ago

Datseris commented 1 year ago

Hm, I am thinking that we violate the "pick one word per concept" principle. We mix things: alphabet, length, symbols, words, letters, state-space properties...

I think we should decide on one word for the concept "symbol" or "letter" or whatever. While "symbol" may sometimes be confused with Base.Symbol, I guess it is the best option. Alphabet stuff can get confusing, because then you'd expect a symbol of the alphabet to be a letter, or whatever. Then again, we may use the word "event", which doesn't conflict with Base and is general enough. Actually, we already use "event" in probabilities_and_events (which, by the way, we haven't actually listed in the documentation).

Okay, so here is what I propose:

What do you think?

p.s.: The pick-one-word-per-concept idea isn't originally mine, see "A Handbook of Agile Software Craftsmanship".

kahaaga commented 1 year ago

In summary, I agree that we should use established terminology, and I like the idea of using "event", because it's already an established term in the context of probability spaces. Since we're pursuing a more meaningful terminology, we should stick with existing probability-related terminology, e.g. from a reference textbook.

I have a quite lengthy comment/suggestion that I started writing here, but I think it is best summarised as a documentation page outlining our choice of terminology. I'll summarize my suggestions as a PR (not making any actual code changes, just including a new documentation page with some terminology rationale), and we can discuss from there.

kahaaga commented 1 year ago

It is also important that we reach an agreement on this before submitting the JOSS paper.

Datseris commented 1 year ago

offtopic but I don't think we should go for JOSS, we should go for Chaos. Inferior entropy software has been published in Chaos so we most definitely can publish this there. Perhaps do some cross evaluation of some methods on some aspect or whatever.

Datseris commented 1 year ago

we can e.g., use this new "missing patterns surrogates" for all methods and see which performs the best. It sounds like new research and it would take us a day or two.

kahaaga commented 1 year ago

> offtopic but I don't think we should go for JOSS, we should go for Chaos. Inferior entropy software has been published in Chaos so we most definitely can publish this there. Perhaps do some cross evaluation of some methods on some aspect or whatever.
>
> we can e.g., use this new "missing patterns surrogates" for all methods and see which performs the best. It sounds like new research and it would take us a day or two.

Yeah, I think Chaos is a good choice. With the machinery we're releasing for v2.0, there is potential for a lot of new research. So we could frame the paper as a "tool for new research in probabilities and information theoretic methods" (or something cheesy like that), and just supplement it with a few use cases, like the missing patterns.

kahaaga commented 1 year ago

@Datseris Ok, I've implemented some suggested changes in PR #141.

In summary:

With these points, I think I've addressed all potential issues I would point out as a reviewer of the software. But then again, much of this comes down to preference.

What do you think?

kahaaga commented 1 year ago

Closed with #141