Logged In: YES user_id=1238940 Originator: NO
Accepted for Objectives:
1. normalization
2. averaging
3. partitioning

Rejected:
4. unsupervised classification
5. supervised classification
Comment: Classification would definitely fall under objective; however, supervised/unsupervised are qualifiers attached to the method/plan/algorithm used to achieve classification.

More information needed:
6. center calculation
7. spread calculation
8. moment calculation
Apart from the missing definitions, would you require a parent term 'calculation'? If so, it seems that all data transformations involve some level of calculation.
Original comment by: proccaserra
Logged In: YES user_id=1155048 Originator: YES
Regarding points 4 and 5, here are some clarifications. Supervised and unsupervised classification are meant to represent quite different types of objectives. Even if in each case objects get somehow classified as a result (hence the name 'classification' is used in both), the main objective of a data transformation that does supervised classification is *building a predictor*. The predictor can then be used to classify novel (future) data, but typically the end product of a supervised classification method is the predictor itself. A data transformation that performs unsupervised classification has as its main objective *organizing the (existing) input objects into classes*.

Initially we had thought to have a parent named 'classification' with the two children supervised and unsupervised, but it seemed harder to give a clean definition of 'classification' and of the qualifiers 'supervised' and 'unsupervised' in such a way that the essence of supervised and unsupervised classification would be captured fully and accurately. That's why we opted for directly and carefully defining these two terms; if you look at the proposed definitions, they specify the difference in objectives explained above.

I'm not clear on your question about a 'calculation' parent term. For children of 'data transformation', the names typically involve terms such as 'calculation' or 'transformation' to denote the fact that a data transformation is performing an action (see recent correspondence with Frank Gibson and the DT branch email list on this topic), but here we are trying to propose definitions for terms which represent *objectives* of data transformations, not data transformations themselves.
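To make the contrast concrete, here is a minimal sketch of the two objectives (scikit-learn and the data are used purely for illustration; the point is the difference in end products, not any particular toolkit):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))     # existing input objects
y_train = rng.integers(0, 2, size=100)  # known class labels
X_new = rng.normal(size=(10, 5))        # novel (future) data

# Supervised classification: the end product is the *predictor*
# itself, learned from (object, label) pairs; it can then be used
# to classify novel data.
predictor = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
labels_for_new = predictor.predict(X_new)

# Unsupervised classification: the end product is an *organization
# of the existing inputs into classes*; no labels are known a priori.
classes_for_existing = KMeans(n_clusters=2, n_init=10).fit_predict(X_train)
```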
Original comment by: manduchi
Logged In: YES user_id=138355 Originator: NO
If the objective of supervised learning is to build a predictor, then why not call the objective "build a predictor"? What's the difference between averaging and center calculation? Some unsupervised learning methods produce partitions; does it make sense to merge them?
The motivation for adding "objectives" was to eliminate asserted multiple inheritance. Would it be possible to quickly say how each objective here achieves that? (E.g., a list of transformations, each of which can achieve at least two objectives, would be sufficient.)
Original comment by: alanruttenberg
Logged In: YES user_id=1238940 Originator: NO
Thanks Elisabetta for the addition. I have a similar comment to the one made by Alan. In terms of main outcome/end product/objective, can we state the following?
supervised classification has_objective 'predictor building'
unsupervised classification has_objective 'classification'
Then, regarding my question about 'calculation': I am just uneasy with having 'calculation' show up in several areas of the current OWL files (both under objective and under data transformation). It could be that, as with the role branch, we would have to append the 'objective' suffix to disambiguate. But this would not entirely solve the problem, as we would need clarification of usage and would also have to come up with preferred terms and labels that would not necessarily contain the suffix (e.g. for text mining and tagging purposes).
cheers
P
Original comment by: proccaserra
Logged In: YES user_id=1155048 Originator: YES
Hi, you are both making good points which are useful for brainstorming.
1. Regarding supervised classification, its main objective is to build a predictor that attaches to novel objects class labels from a set of known labels. That is, a classifier is a special type of predictor. In principle there might be predictors which accomplish a different sort of task, such as predicting the value of a certain response variable based on the values of an independent variable (e.g. based on some curve fitting method). In other words, the term 'classification' is in there for a reason, because at the end one is enabled to attach class labels: the supervised approach aims at building a predictor for known class labels, while the unsupervised one aims at organizing existing objects into classes that are not known a priori. As I mentioned, originally we thought to place a parent 'classification' above the supervised and the unsupervised one, and we can still do that if it is deemed necessary or better. It was just not that simple to give a per se definition of 'classification' and of the two qualifiers 'supervised' and 'unsupervised' such that, by combining them, one could obtain clear and accurate definitions of the two children. But we can try to look into that again (e.g. thinking of class discovery for unsupervised classification and class prediction for supervised classification: this was the original thought, but when coming to the nitty gritty of the defs it became harder to implement). Or we could proceed by giving the defs for the two children as proposed in this thread and define a parent 'classification' simply by inference, as the union of the two.
2. Regarding partitions: as per editorNote 2 in the def of unsupervised classification, we wanted to encompass not only methods which partition the input set, but also 'fuzzy' methods where you can have class overlap, or more precisely where you attach to each object and each class a probability that the object is in that class. Maybe we can make this more explicit in the definition. We do make this explicit in the def of supervised classification: to emphasize the fact that this might yield a partitioning of objects or might be fuzzy, we say that it could be deterministic or probabilistic. On the other hand, partitioning can occur when your objective is not quite that of discovering or predicting classes, but in simpler situations. For example, when you have some simple filtering going on which will divide your initial input into 'stuff to keep' and 'stuff to discard'. This organizes your input into two classes, but these are a priori known (unlike the case with unsupervised classification) and you are not really using a machine learning technique to achieve this (like you do in supervised classification). In brief, there is overlap between instances of these classes but no clear containment of one class in another, so how to best merge them is not clear to me.
3. Regarding averaging and center calculation: I'm not sure if Monnie has submitted the def for center calculation, but this is an objective common to data transformations such as mean calculation, median calculation and mode calculation. Averaging, as defined above, has to do with mean calculations. Alan's question could be restated as 'what's the difference between averaging and mean calculation', and indeed it's a reasonable question which I've asked myself as well; hence editorNote 2 above. I guess one reason to have averaging as an objective is to distinguish goals in cases like that in which you do a dye-swap merge (see the sketch at the end of this comment). You might be doing that with the simple goal of merging the data to obtain some average from each pair of dye-swaps, or you might be doing a dye-swap merge (and can do that only when appropriate assumptions are met by the input) with the more ambitious goal of normalizing. I'm not sure this is enough to justify the need for an averaging objective, and we can certainly discuss this further.
4. Philippe, regarding your concerns on 'calculation', I guess you are referring to center, spread and moment calculation. I can't think of nicer names for these: 'center_calculation_objective' doesn't seem great... The reason we have these objectives was mostly to solve an issue of multiple inheritance.
Maybe it would be more efficient to further discuss all the above by voice at some point? Let me know if you want to set up a call. We probably should do it separately from our respective branch conference calls, but invite anybody interested to join in.
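For the dye-swap point in (3), here is a minimal sketch of the arithmetic involved (the numbers are made up, and the usual assumption is made that the biological signal flips sign under the swap while the dye bias does not):

```python
import numpy as np

# Hypothetical log-ratios (M = log2(Cy5/Cy3)) for one dye-swap pair;
# in the second hybridization the dye assignment is reversed, so the
# biological signal changes sign while dye-specific bias does not.
m1 = np.array([1.2, -0.4, 0.9, 0.1])   # hybridization 1
m2 = np.array([-0.8, 0.7, -0.5, 0.2])  # hybridization 2 (dyes swapped)

# The dye-swap merge: the same arithmetic can serve two objectives.
# Read as plain averaging, it just summarizes each pair into one
# value; when the above assumptions hold, it also cancels the
# dye-specific bias, i.e. it achieves normalization.
merged = (m1 - m2) / 2.0
```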
Original comment by: manduchi
Logged In: YES user_id=1155048 Originator: YES
In my last response I forgot to address one of Alan's questions, namely to quickly say how each proposed objective might help reduce multiple inheritance. Here are some examples:
1. A dye-swap merge can be used simply for averaging, or it could be used for normalization.
2. Hierarchical clustering (this term is slated for curation in our branch) could be used simply to build a tree (e.g. for phylogenetic analyses, another term to be curated in our branch) or to achieve an unsupervised classification (when one cuts the tree at some level).
3. A projection (e.g. filtering a vector based on characteristics of its components, like x1>0, x2<0, etc.) could be used simply to select components for further analyses (e.g. in microarray analyses), or it could be used to partition the space into two components (I guess this happens in flow cytometry, where you use filters to partition a dataset).
4. A mean calculation is both a center calculation and a moment calculation.
5. A variance calculation is both a spread calculation and a moment calculation.
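Points 4 and 5 can be seen directly in the formulas: the mean is the first raw moment, and the variance is the second central moment. A minimal numeric sketch:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Mean: a center calculation, and also the first raw moment E[X].
mean = x.mean()

# Variance: a spread calculation, and also the second central
# moment E[(X - E[X])^2].
variance = ((x - mean) ** 2).mean()

assert np.isclose(mean, 5.0) and np.isclose(variance, 4.0)
```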
Original comment by: manduchi
Logged In: YES user_id=1155048 Originator: YES
One more example to be added to the list above: you might be performing a k-means clustering (or any other type of clustering) with the intent of performing both an unsupervised and a supervised classification. For example, you can discover classes through k-means (unsupervised classification), and then classify novel instances (supervised classification) using these classes, e.g. assigning each novel instance to the class whose centroid is closest to it. That is, this double goal is achieved using the same k-means data transformation as the first step. This was done, for example, in a famous paper by Golub et al. (Science, 1999) on ALL and AML classification.
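A minimal sketch of this double use (scikit-learn is used purely for illustration; its KMeans.predict happens to assign each point to the cluster whose centroid is closest, which is exactly the rule described above):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X_existing = rng.normal(size=(60, 4))  # e.g. existing expression profiles
X_novel = rng.normal(size=(5, 4))      # novel (future) instances

# Step 1, unsupervised classification: discover classes in the
# existing data via k-means.
km = KMeans(n_clusters=2, n_init=10).fit(X_existing)

# Step 2, supervised classification: the fitted centroids act as a
# predictor, assigning each novel instance to the class whose
# centroid is closest to it.
novel_labels = km.predict(X_novel)
```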
Original comment by: manduchi
Logged In: YES user_id=1155048 Originator: YES
Sorry to keep adding to this thread, I guess this exchange got me on a roll... I was just thinking about an additional problem, should we decide to have a 'classification' parent to supervised and unsupervised classification. It might well happen that in the future we will see the need to add an objective 'build a predictor', to use a name mentioned by Alan. Again, a data transformation that uses some model to predict values of a response variable would fall under this objective. If we had both 'classification' and 'build a predictor', then unsupervised classification would be a child of both.

In my view there is a main problem here. On the one hand OBI, for whatever possibly very valid reason, wants to discourage multiple inheritance. I only joined OBI later on (and focused on the DT aspect), so I don't feel I have enough knowledge of the arguments presented to this end to be able to cast a judgement on whether or not this is a good or necessary policy. On the other hand, at least for what concerns data transformations, multiple inheritance is a matter of fact. There are lots of different data transformations that stand in relationships more complex than those representable by a tree. So, in order to avoid multiple inheritance for data transformation, I'm afraid that we are forced into solutions (such as using objectives etc.) which appear to add more complications than if we could use multiple inheritance. Also, I'm not sure such solutions would really be able to handle the problem, as again we are trying to capture something where multiple inheritance is built in with approaches that try to get rid of the latter. I'm afraid that if we try to avoid multiple inheritance under data transformation, then we might at some point have to confront it again in objective...
Original comment by: manduchi
Logged In: YES user_id=1155048 Originator: YES
P.S. I meant *supervised* classification would be a child of both classification and build predictor (sorry for the typo).
Original comment by: manduchi
Logged In: YES user_id=1155048 Originator: YES
At our Data Transformation conference call on 02/21/08 we were joined by Alan R and further discussed some of these, focusing on classification, prediction and multiple inheritance issues. Here is the summary of things as they currently stand. (The detailed defs were not discussed during the call, so they reflect my proposals; but the hierarchy and concepts are as discussed at the call.)
1. Objective name: 'class discovery' (possible synonym 'unsupervised classification')
Def: Attaching class labels to input objects, where the number of classes and their specifications are not known a priori. The class assignment can be definite or probabilistic.
2. Objective name: 'build predictor' (or 'predictor building' or 'prediction', to keep names consistent with (3) below?)
Def: Construction of a mechanism to make predictions on attributes of novel input data based on what has been learned from training data.
3. Objective name: 'class prediction' (possible synonym 'supervised classification')
**Child of (2)**
Def: Creating a class predictor from training data through a machine learning technique. The training data consist of pairs of objects and class labels for these objects. The resulting predictor can be used to attach class labels to any valid novel input object. Depending on usage, the prediction can be definite or probabilistic. (A classification is learned from the training data and can then be tested on test data.)
4. 'classification': **defined/inferred** class as the union of 'class discovery' and 'class prediction' (see the sketch after this list).
5. 'normalization', 'averaging', 'partitioning' (as defined in the original submission of this tracker item) were not discussed during the call. At least 'normalization', though, had been agreed upon previously, so it should be retained as defined.
6. 'center calculation', 'spread calculation' and 'moment calculation' were not discussed during the call. We need them to avoid multiple inheritance (examples are provided in other comments to this tracker item). Definitions are to be provided by Monnie McGee.
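For point 4, here is a minimal sketch of how such a defined class could be expressed (owlready2 is used purely for illustration, and the IRI and class names are hypothetical, not actual OBI identifiers):

```python
from owlready2 import Thing, get_ontology

# Hypothetical IRI and class names, for illustration only.
onto = get_ontology("http://example.org/dt-objectives.owl")

with onto:
    class objective(Thing): pass
    class class_discovery(objective): pass
    class class_prediction(objective): pass

    # 'classification' as a defined class: equivalent to the union of
    # 'class discovery' and 'class prediction', so that a reasoner
    # infers membership rather than it being asserted.
    class classification(objective):
        equivalent_to = [class_discovery | class_prediction]
```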
Original comment by: manduchi
Assigned to James to verify.
Original comment by: bpeters42
We don't have a moment calculation objective, but since I have no competency questions for it yet, we don't need it at the moment. Everything else is in OBI now.
Original comment by: jamesmalone
The following 8 new objective terms are needed by the DT branch. *For any proposed change to the definitions below please discuss with the DT branch.* Thanks.
1. preferred_term: normalization
definition: The objective of a data transformation aimed at removing systematic sources of variation to put the data on equal footing in order to create a common base for comparisons.
definition_editors: Elisabetta Manduchi, James Malone, Helen Parkinson
2. preferred_term: averaging
definition: The objective of a data transformation aimed at performing mean calculations on its inputs.
definition_editor: Elisabetta Manduchi
editor_notes: (1) Mean calculation is a class defined in the data transformation hierarchy. (2) This term might need revision as the data transformation branch evolves; we'll need to coordinate.
3. preferred_term: partitioning
definition: The objective of a data transformation aimed at generating a collection of disjoint non-empty subsets whose union equals a non-empty input set. (A minimal sketch illustrating this appears after this list.)
definition_editor: Elisabetta Manduchi
4. preferred_term: supervised classification
definition: The objective of a data transformation aimed at creating a predictor from training data through a machine learning technique. The training data consist of pairs of objects (typically vectors of attributes) and class labels for these objects. The resulting predictor can be used to attach class labels to any valid novel input object. Depending on usage, the prediction can be definite or probabilistic. A classification is learned from the training data and can then be tested on test data.
definition_editor: Elisabetta Manduchi
editor_note: The above definition has been generated by combining and modifying the definitions at http://en.wikipedia.org/wiki/Supervised_classification and http://www.csse.monash.edu.au/~lloyd/tildeMML/Structured/Supervised/
5. preferred_term: unsupervised classification
definition: The objective of a data transformation aimed at organizing input objects (typically vectors of attributes) into classes, where the number of classes and their specifications are not known a priori. Depending on usage, the class assignment can be definite or probabilistic.
definition_editor: Elisabetta Manduchi
editor_notes: (1) The above definition has been generated by adapting the definition at http://www.csse.monash.edu.au/~lloyd/tildeMML/Structured/Unsupervised/ (2) I purposely used the word "organizing" and avoided "partition", in order to encompass fuzzy methods, or in general methods that allow the resulting classes to possibly overlap.
6. preferred_term: center calculation
definition: to be provided by Monnie McGee (for both continuous and discrete random variables)
7. preferred_term: spread calculation
definition: to be provided by Monnie McGee (for both continuous and discrete random variables)
8. preferred_term: moment calculation
definition: to be provided by Monnie McGee (for both continuous and discrete random variables)
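As a minimal illustration of the partitioning definition in item 3 (the function and names here are hypothetical, added only to make the defining properties checkable):

```python
def partition_by(predicate, items):
    """Partition 'items' into the subset satisfying 'predicate' and
    its complement, dropping any empty block."""
    keep = {x for x in items if predicate(x)}
    discard = set(items) - keep
    return [block for block in (keep, discard) if block]

values = {-3, -1, 0, 2, 5}
blocks = partition_by(lambda v: v > 0, values)

# The defining properties: blocks are non-empty, pairwise disjoint,
# and their union equals the (non-empty) input set.
assert all(blocks)
assert set().union(*blocks) == values
assert sum(len(b) for b in blocks) == len(values)  # disjointness
```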
Reported by: manduchi
Original Ticket: obi/obi-terms/19 (https://sourceforge.net/p/obi/obi-terms/19)