Closed by daxenberger 8 years ago
Agreed. See comments on r704.
I am particularly interested in moving the calculations out of BatchCrossValidationReport,
and into an Object that all Reports can access.
Reported by EmilyKJamison
on 2014-03-21 18:25:17
A task may be a bit heavy, but creating some separate classes for data structures commonly
used in reports and for evaluations over these data structures appears sensible. The
structures and functionality could easily be reused and the reports would become more
light-weight.
Reported by richard.eckart
on 2014-03-23 12:44:03
We need to create a central Evaluation Object in TC, which will serve as a connector
between the machine learning framework and the evaluation (i.e. all reports).
Why?
1) This Evaluation Object can be used to create a file in a format which can be imported
into an external program to do significance tests (issue 112).
2) To avoid bias in the aggregation of results from CV folds, an overall confusion
matrix should be created which is used to further calculate F1 etc. An Evaluation Object
can also hold the overall confusion matrix (issue 113).
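The bias mentioned in point 2 can be illustrated with a minimal sketch. F1 is not linear in the underlying counts, so the mean of per-fold F1 scores generally differs from the F1 computed on a confusion matrix pooled over all folds. The names below (`PooledConfusionMatrix`, `addFold`, `f1`) are hypothetical and not part of DKPro TC, and the fold counts are made-up illustration data.

```java
/** Sketch only: pools per-fold confusion matrices before computing F1. */
final class PooledConfusionMatrix {
    // counts[gold][predicted], summed over all folds added so far
    private final long[][] counts;

    PooledConfusionMatrix(int numLabels) {
        counts = new long[numLabels][numLabels];
    }

    /** Adds one fold's matrix (rows: gold label, columns: predicted label). */
    void addFold(long[][] fold) {
        for (int g = 0; g < counts.length; g++) {
            for (int p = 0; p < counts.length; p++) {
                counts[g][p] += fold[g][p];
            }
        }
    }

    /** F1 for one label, computed from the pooled counts. */
    double f1(int label) {
        long tp = counts[label][label];
        long fp = 0;
        long fn = 0;
        for (int i = 0; i < counts.length; i++) {
            if (i != label) {
                fp += counts[i][label]; // other gold labels predicted as this label
                fn += counts[label][i]; // this gold label predicted as another label
            }
        }
        long denom = 2 * tp + fp + fn;
        return denom == 0 ? 0.0 : (2.0 * tp) / denom;
    }

    public static void main(String[] args) {
        PooledConfusionMatrix pooled = new PooledConfusionMatrix(2);
        pooled.addFold(new long[][] {{8, 2}, {1, 9}});  // illustrative fold 1
        pooled.addFold(new long[][] {{1, 4}, {0, 15}}); // illustrative fold 2
        System.out.println("Pooled F1 for label 0: " + pooled.f1(0));
    }
}
```

With these illustrative counts, the pooled F1 for label 0 is 0.72, whereas averaging the two per-fold scores (16/19 and 2/6) gives roughly 0.59.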
Reported by daxenberger.j
on 2014-04-22 15:42:58
Reported by daxenberger.j
on 2014-04-22 15:48:43
Reported by daxenberger.j
on 2014-04-22 15:49:15
Reported by daxenberger.j
on 2014-06-04 12:34:49
Reported by daxenberger.j
on 2014-09-05 08:45:17
Started
This issue was updated by r1133 and r1134.
Reported by daxenberger.j
on 2014-10-08 09:12:56
This issue was updated by revision r1136.
tests for soft/strict evaluation.
Reported by daxenberger.j
on 2014-10-08 09:48:59
This issue was updated by revision r1137.
adding TODOs.
Reported by daxenberger.j
on 2014-10-08 10:00:34
This issue was updated by revision r1289.
introducing a generic multi-label result wrapper to work with the latest version of
meka and DKPro TC's new evaluation module
Reported by daxenberger.j
on 2014-12-10 08:54:40
This issue was updated by revision r1359.
Created a test for the new evaluation report, in particular the outcome ID report
Reported by daxenberger.j
on 2015-03-17 11:47:34
This issue was updated by revision r1366.
Expanding functionality of the evaluation module/helper classes.
Work in progress.
Reported by daxenberger.j
on 2015-03-17 16:16:36
Reported by daxenberger.j
on 2015-03-27 13:05:34
Good reference for multi-label evaluation and calculation of scores: http://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics
Reported by daxenberger.j
on 2015-03-30 08:24:47
This issue was updated by revision r1396.
restructuring evaluation module; mostly multi-label part
several tests are broken/ignored atm, needs to be investigated
Reported by daxenberger.j
on 2015-04-02 16:04:35
This issue was updated by revision r1400.
enabling crossvalidation setup with evaluation module
adding test for crossvalidation setup with evaluation module
Reported by daxenberger.j
on 2015-04-03 15:09:25
This issue was updated by revision r1404.
copy all relevant discriminators into CV report on batch task level;
javadoc
Reported by daxenberger.j
on 2015-04-07 14:05:51
This issue was updated by revisions r1484 and r1485.
Added new measures; corrected calculation of multi-label scores; documentation
Reported by daxenberger.j
on 2015-05-18 10:25:04
@daxenberger Is there anything left to do or should we move this one to 0.9.0 ?
move to the next milestone. this hasn't been tested and integrated properly yet.
If you tell me what is left to do or where to start I would continue integration. Which is the next step here?
A rough roadmap:
@daxenberger I started a new branch, making SvmHmm my guinea pig. SvmHmm outputs its own confusion matrix files. Can these be removed now, too? I am not sure they are even correct, since this is also untested, but it should be handled by the new evaluation module now anyway?
Except for the additional files SvmHmm creates, the integration is not so hard. I can remove SVMHMMClassificationReport and SVMHMMBatchCrossValidationReport once the interface is adapted, right?
> [...] but it should be handled by the new evaluation module now anyway?
Yes, in theory. I'm not sure how well this works atm, so maybe deprecate the old reports rather than removing them completely.
> I can remove SVMHMMClassificationReport and SVMHMMBatchCrossValidationReport once the interface is adapted, right?
Same here: I'd prefer to remove them from demos etc., but instead of completely deleting them, better deprecate.
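Deprecating rather than deleting could look like the following minimal sketch. `LegacyClassificationReport` is a hypothetical stand-in, not the actual SvmHmm report class, and the Javadoc wording is only an assumption about what a migration hint might say.

```java
/**
 * Hypothetical stand-in for an old report class.
 *
 * @deprecated Superseded by the new evaluation module; kept so that existing
 *             setups keep compiling during the transition.
 */
@Deprecated
class LegacyClassificationReport {
    String describe() {
        return "legacy report";
    }
}

final class DeprecationDemo {
    public static void main(String[] args) {
        // @Deprecated is retained at runtime, so tooling can detect it via reflection.
        boolean deprecated =
                LegacyClassificationReport.class.isAnnotationPresent(Deprecated.class);
        System.out.println("deprecated: " + deprecated);
    }
}
```

Callers outside the class then get a compiler warning instead of a build failure, which leaves time to verify the new module before the old reports are removed.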
What do we do with the Mallet module? It has been deprecated since 0.7.0. All changes to the API have to be implemented there too. Maybe it's time to remove it entirely? Whoever needs Mallet should use either 0.7.0 or 0.8.0.
Why was it deprecated?
I think the code is a bit messy, some things never seem to have worked (sequence classification), and we have Weka and Crfsuite, which cover everything. If someone wants to revive it, it should be reimplemented from scratch imo.
Very slow. It was not really usable for real problems.
@daxenberger all MLAs are supposed to have threshold as the last entry in the id2outcome file? What value do I set if the MLA doesn't have such a parameter?
Yeah, that is necessary to have a common format for all kinds of learning modes. threshold will be ignored if multi-labeling is not applied and should thus be set to -1 (or 0).
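A minimal sketch of that convention follows. The thread does not spell out the exact id2outcome layout, so the line format `id=prediction;gold;threshold` used here is an assumption, and `OutcomeLineSketch`, `format`, and `parseThreshold` are hypothetical names, not DKPro TC API. The only point taken from the discussion is that the threshold comes last and falls back to -1 when the learner has no such parameter.

```java
import java.util.OptionalDouble;

/** Sketch only: threshold as the last entry, -1 when the learner has none. */
final class OutcomeLineSketch {
    // Placeholder value for learners without a threshold parameter.
    static final double NO_THRESHOLD = -1.0;

    /** Builds one line in the ASSUMED layout id=prediction;gold;threshold. */
    static String format(String id, String prediction, String gold,
                         OptionalDouble threshold) {
        return id + "=" + prediction + ";" + gold + ";"
                + threshold.orElse(NO_THRESHOLD);
    }

    /** Reads the last semicolon-separated entry of a line as the threshold. */
    static double parseThreshold(String line) {
        return Double.parseDouble(line.substring(line.lastIndexOf(';') + 1));
    }

    public static void main(String[] args) {
        // Single-label learner: no threshold available, so -1 is written.
        String single = format("doc1", "1", "0", OptionalDouble.empty());
        // Multi-label learner: the actual threshold is written.
        String multi = format("doc2", "1", "1", OptionalDouble.of(0.5));
        System.out.println(single + " -> threshold " + parseThreshold(single));
        System.out.println(multi + " -> threshold " + parseThreshold(multi));
    }
}
```

Keeping the field present in every line means a reader of the file never has to branch on the learning mode just to split the entries.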
@daxenberger ok thx. Do we adapt Mallet too? Maybe it's time to remove it entirely?
How much effort is it to drag it along?
About the same as for the other modules: half a day. I am more concerned about continuing to maintain a module that has been dead for two releases. So far, changes have been reflected in Mallet whenever they affected interfaces and made it mandatory to change it there too (otherwise Jenkins would fail), but there are no unit tests for the module. I find its whole state a bit questionable; imo it is rotting code that causes effort with every change, and no one(?) gains any advantage from the effort spent. So I would suggest removing it :)
Ok for me
Might be worth noting that there is a new version of mallet since a few days:
http://search.maven.org/#artifactdetails%7Ccc.mallet%7Cmallet%7C2.0.8%7Cjar
What I have done so far
*usingTCEvaluationReport
reports main reports BatchTrainTestReport
and BatchCrossvalidationReport
Ok, here is a list of issues I am not sure how to handle:
* WekaRegressionExperimentTest triggers an exception when calling a getLabel() method - no regression implemented yet?
* Mallet module - see previous postings - is untouched at the moment
* I can't fix the Groovy test cases - support is not available for my Eclipse version. @daxenberger can you fix those?
For curiosity, what Eclipse version are you using?
4.4 Luna. This one isn't working: https://marketplace.eclipse.org/content/groovygrails-tool-suite-ggts-eclipse
I have been using Luna before and now I am using Mars.
For Luna: http://dist.springsource.org/release/GRECLIPSE/e4.4/ For Mars: http://dist.springsource.org/snapshot/GRECLIPSE/e4.5/
Sorry, not working; I tried those too. Installation fails with an exception.
I installed another Eclipse version - I don't understand what those Groovy test cases are supposed to do, which makes it hard to fix them. I could use a hand with those... @daxenberger help wanted
> I can't fix the Groovy Test cases - support is not available for my Eclipse version @daxenberger can you fix those?
I'm compiling the groovy demos without problems under Eclipse Mars (4.5.2) with Groovy Compiler (1.8-2.4) 2.9.2.xx
> WekaRegressionExperimentTest triggers an exception when calling a getLabel() method - no regression implemented yet?
See my response on the mailing list - no getLabel() method for regression.
> Mallet module - see previous postings - is untouched at the moment
I am a bit reluctant to totally remove the module since Mallet has become more active recently (see above). Maybe move the module into its own branch?
> Weka has various other *Adapter classes with additional reports - how to handle those?
The Prediction adapters are not needed anymore, since we have Save/LoadModel now. Statistics adapters are already ported to the new evaluation mode. Meka and Weka need to be distinguished due to differences in single/multi-label mode.
> I installed another Eclipse version - I don't understand what those Groovy test cases are supposed to do, which makes it hard to fix them. I could use a hand with those... @daxenberger help wanted
Which tests do you refer to? The tests in the groovy demo module do the same as in the java module: simply execute the demos.
The PairTwentyNewsgroupsDemo fails with a Stanford-triggered ClassCastException on my machine?
The regression demo should fail too due to the label() issue.
```
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to [Ledu.stanford.nlp.util.Index;
    at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2164)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1249)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1226)
    at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2278)
    at de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordNamedEntityRecognizer$1.produceResource(StanfordNamedEntityRecognizer.java:170)
    at de.tudarmstadt.ukp.dkpro.core.stanfordnlp.StanfordNamedEntityRecognizer$1.produceResource(StanfordNamedEntityRecognizer.java:141)
```
@daxenberger Regarding mallet: I can fork off a new branch from master with the current state. I would then remove the mallet module from my issue50 branch, which removes the module once it is merged into master.
> the PairTwentyNewsgroupsDemo fails with a Stanford triggered ClassCastException on my machine?
ok; I'll have a look at this
> Regarding mallet: I can fork-off a new branch from master with the current state. I would then remove the mallet module from my issue50 branch which removes the module once it is merged into master
sounds good
Oookay. I set up a Jenkins job, and it seems Jenkins does not have the problems I have with the Groovy experiments. I think I am quite close to merging this branch into master. Seemingly everything is working. I added two of the easiest regression measures to implement, for a few simple tests. The other bugs will probably only show themselves when we actually start using the module.
I merged the changes into master - an open todo on the checklist is the measure implementations (for regression).
@daxenberger I have the WekaFeatureValuesReport as a left-over. The report used the former result.txt. If this report is still needed, you would have to upgrade it to use the new module. I am not really sure what it is supposed to do.
Maybe we start an own issue for adding the remaining measures?
Originally reported on Google Code with ID 50
Reported by daxenberger.j
on 2013-09-17 16:41:45
Edit by Tobias Horsmann - ToDo list: