Hybrid Algorithms - Githubissues

diegoesteves commented 8 years ago

We need to create a superclass here, e.g. for SVM + LR.

mommi84 commented 8 years ago

We should first discuss the possibility of maintaining a general algorithm taxonomy (see, for instance, this one: https://s3.eu-central-1.amazonaws.com/tommaso-soru/files/MLMastery-ML-Algorithms-Mindmap.png), but this would be a tedious task and the list would never cover everything. If we are not going to do this, then from a theoretical point of view a hybrid algorithm is different from the parts that compose it, so I see no need for a superclass here. On the other hand, a good way to disambiguate algorithms is always to keep them linked to their reference paper.

joaquinvanschoren commented 8 years ago

I agree with Tommaso that we will never be able to keep track of every possible algorithm; they are created faster than we can add them. And there are millions of problems when you want to 'classify' an algorithm: only a few are 'textbook' algorithms, most differ from previous algorithms in tiny details, and some are not classifiable at all because they are something else entirely. It is a huge task to look into the internals and see whether an algorithm is of type A or B, and making many subclasses is only going to make that task harder. I would only model very general classes, e.g. where the input data is fundamentally different (a table, a process log, itemsets, ...). All algorithms should just be individuals of those. I would even be reluctant to define these on both the 'specification' and the 'implementation' level, since this doubles the work. Ultimately, you can say a lot of useful things about algorithm implementations, but very little about their pseudocode specifications.
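A minimal sketch of the flat modeling suggested above, not ML-Schema's actual vocabulary: the names `InputKind`, `Algorithm`, `reference_paper`, and `by_input_kind` are hypothetical, and the citation strings are placeholders. It assumes only a handful of general classes keyed by input data type, with every algorithm (hybrid or not) as a plain individual linked to its reference paper.

```python
from dataclasses import dataclass
from enum import Enum


class InputKind(Enum):
    """Very general classes, distinguished only by the kind of input data."""
    TABLE = "table"
    PROCESS_LOG = "process log"
    ITEMSETS = "itemsets"


@dataclass(frozen=True)
class Algorithm:
    """One algorithm as an individual; fields are illustrative assumptions."""
    name: str
    input_kind: InputKind
    reference_paper: str  # placeholder, to be replaced by a real citation


# A hybrid is just another individual; it needs no superclass over its parts.
svm = Algorithm("SVM", InputKind.TABLE, "<reference paper for SVM>")
lr = Algorithm("LR", InputKind.TABLE, "<reference paper for LR>")
hybrid = Algorithm("SVM + LR", InputKind.TABLE, "<reference paper for the hybrid>")


def by_input_kind(algorithms, kind):
    """Return the names of all algorithms that take the given kind of input."""
    return [a.name for a in algorithms if a.input_kind is kind]
```

In this sketch, disambiguation rests on `reference_paper` rather than on a class hierarchy, so two algorithms with similar names stay distinct as long as their papers differ.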


agnieszkalawrynowicz commented 8 years ago

I am in favor of having both specification and implementation (especially since there are implementations with their own metadata in environments such as RapidMiner and Weka, and we even have them represented in our KB). But in my view, particular implementations (e.g. in well-known tools) constitute a knowledge base, not a schema. I do not see any problem in the fact that it is hard to model everything that has been proposed or implemented. We do not have to :-)
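The schema versus knowledge base distinction above could be sketched roughly as follows. This is an assumption-laden illustration, not ML-Schema's actual terms: the class names `AlgorithmSpecification` and `Implementation`, the helper `implementations_of`, and the metadata entries are all hypothetical. The two classes stand for the schema; the list of concrete records stands for the knowledge base, which can stay incomplete.

```python
from dataclasses import dataclass, field


@dataclass
class AlgorithmSpecification:
    """Schema level: the algorithm as described, independent of any tool."""
    name: str


@dataclass
class Implementation:
    """Schema level: a concrete realization of a specification in some tool."""
    implements: AlgorithmSpecification
    tool: str
    metadata: dict = field(default_factory=dict)  # tool-specific metadata


# Knowledge-base level: particular records, added only as needed.
svm_spec = AlgorithmSpecification("SVM")
knowledge_base = [
    Implementation(svm_spec, "Weka", {"note": "placeholder metadata"}),
    Implementation(svm_spec, "RapidMiner"),
]


def implementations_of(spec, records):
    """List the tools that have a recorded implementation of the given spec."""
    return [r.tool for r in records if r.implements is spec]
```

Nothing in the two classes changes when a new tool or algorithm appears; only `knowledge_base` grows.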

ML-Schema / core

Hybrid Algorithms #8