profversaggi closed this issue 6 years ago
You can certainly request it :)
We definitely want to do SRL. At the moment the following tasks are higher priority:
- Improved CI framework, running on our own test server, as Travis CI doesn't give us enough memory to test with the models
- Better NER, particularly using large phrase dictionaries acquired from Wikipedia
- Multi-lingual support
- Better data parallelism, using Spark, and multi-threading
The good news is that velocity is currently pretty good. The bad news is that it's still hard to hand over these tasks to others, so things are mostly happening in serial.
Unfortunately I can't really give you an estimate for when SRL might be done.
No worries! The spaCy framework is pretty awesome as it is, so we'll use what we can and patiently wait in the queue of tasks to be implemented. Keep up the good work!
+1
Referencing #60, original comment:
Well, the good news is there's lots of good stuff coming. The bad news is it's pushed SRL down a bit.
- Knowledge-based NER
- Multi-lingual
- Stabilise 1.0 API
- Domain adaptation
- Theano integration, neural network models
- SRL
The better news is SRL isn't so much work, given recent research. If you can put in a weekend or two we could probably get this done: http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval034.pdf
The idea is to learn the SRL as a projective tree, by giving up on some of the relations.
What we need:
- Survey the papers implementing similar tree approximations
- Pick the best one
- Implement the data transform
If you can do that initial spadework, I'd be happy to run the experiments. I can supply sample data for the transformation.
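To make "giving up on some of the relations" concrete, here is a deliberately naive sketch of a lossy data transform. It is not the method from the linked paper and not anything in spaCy: the function names, the `(head, dependent, label)` edge format, and the shortest-arc-first heuristic are all assumptions chosen for illustration. It keeps at most one head per token, rejects arcs that would create a cycle or cross an already kept arc, and reports what was dropped. The result is a non-crossing forest; attaching the remaining headless tokens to a root to get a single projective tree is left out.

```python
from typing import List, Optional, Tuple

Edge = Tuple[int, int, str]  # (head index, dependent index, role label)


def crosses(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """Return True if two arcs, given as (left, right) spans, cross."""
    (l1, r1), (l2, r2) = a, b
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def approximate_as_forest(n_tokens: int, edges: List[Edge]):
    """Greedily keep short arcs that preserve a single-head, acyclic,
    non-crossing structure. Returns (heads, labels, dropped)."""
    heads: List[Optional[int]] = [None] * n_tokens
    labels: List[Optional[str]] = [None] * n_tokens
    kept_spans: List[Tuple[int, int]] = []
    dropped: List[Edge] = []

    def creates_cycle(head: int, dep: int) -> bool:
        # Walk up from the proposed head; reaching dep means a cycle.
        node: Optional[int] = head
        while node is not None:
            if node == dep:
                return True
            node = heads[node]
        return False

    # Hypothetical heuristic: shorter arcs first, ties broken by position.
    ordered = sorted(edges, key=lambda e: (abs(e[0] - e[1]), e[1], e[0]))
    for head, dep, label in ordered:
        span = (min(head, dep), max(head, dep))
        if (
            heads[dep] is not None
            or creates_cycle(head, dep)
            or any(crosses(span, other) for other in kept_spans)
        ):
            dropped.append((head, dep, label))
            continue
        heads[dep] = head
        labels[dep] = label
        kept_spans.append(span)
    return heads, labels, dropped


# "She wants to sleep": "She" is an argument of both "wants" and "sleep",
# so the predicate-argument graph is not a tree.
example_edges = [(1, 0, "ARG0"), (1, 3, "ARG1"), (3, 0, "ARG0")]
heads, labels, dropped = approximate_as_forest(4, example_edges)
print(heads, labels, dropped)
```

In this toy case the second head of "She" (the `sleep -> She` relation) is the one that gets dropped. A real transform would presumably encode some of that lost information in the labels so it can be recovered after parsing, which is the part that needs the survey of existing approaches.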
@honnibal I might give this a shot, would you still recommend the tree approximation approach?
Awesome! Some great things you guys have going on there! :-)
I'd still recommend the tree approximation approach, yes. We'd be excited to have you working on this functionality, so @wbwseeker and I will be happy to support you. The main complication is, do you have access to the SRL data? We're not licensed to distribute this to you. We could work around it by putting up a quick API for you to train the model, and giving you some test data to develop with.
Data issues aside, I would suggest the following strategy:
The big question is that the SRL really wants a different API. How should these predicate-argument structures be consumed? And how can we make it easy to move between the SRL annotation and the other annotations spaCy provides?
I would probably suggest letting the SRL functionality live as a separate module for a while. We could release this on PyPI, and let the API evolve. This way you can just write whatever you need for the moment, and not worry about the One True Solution. When it's evolved and stabilised we can integrate it back into the main library.
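As a starting point for the API question, one option for a standalone module is to store predicate-argument structures purely in terms of token indices, so they can be lined up with any other token-indexed annotation. The sketch below is only an illustration under that assumption: `Argument`, `Frame`, and `roles_by_token` are hypothetical names, not part of spaCy, and the `ARG0`/`ARG1` labels are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Argument:
    role: str   # e.g. "ARG0"; the label set is an assumption
    start: int  # index of the first token in the argument
    end: int    # index one past the last token (exclusive)


@dataclass
class Frame:
    predicate: int  # token index of the predicate
    arguments: List[Argument] = field(default_factory=list)


def roles_by_token(frames: List[Frame]) -> Dict[int, List[Tuple[int, str]]]:
    """Invert the frames: map each token index to the
    (predicate index, role) pairs whose argument span covers it."""
    index: Dict[int, List[Tuple[int, str]]] = {}
    for frame in frames:
        for arg in frame.arguments:
            for i in range(arg.start, arg.end):
                index.setdefault(i, []).append((frame.predicate, arg.role))
    return index


tokens = ["She", "wants", "to", "sleep"]
frames = [
    Frame(predicate=1, arguments=[Argument("ARG0", 0, 1), Argument("ARG1", 2, 4)]),
    Frame(predicate=3, arguments=[Argument("ARG0", 0, 1)]),
]
index = roles_by_token(frames)
for i, token in enumerate(tokens):
    print(token, index.get(i, []))
```

Keeping both directions available (frame to tokens, token to frames) is one way to move between the SRL annotation and per-token annotations without committing to a final API.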
Are you referring to the CoNLL 2009 data?
It seems the CoNLL 2012 data is available for download. Would this be appropriate?
Doesn't that require OntoNotes? OntoNotes isn't available for download.
@scottyli Looks fascinating. Did you end up building this out?
Any progress on this front? I would be interested in helping if needed.
Quick update: This might be a nice use case for the new custom processing pipeline components and extension attributes introduced in v2.0!
Merging this with the newer #2336!
I'm VERY impressed with the speed and accuracy of the NER functionality and am only using SRL elsewhere because it doesn't exist here. May I formally request its inclusion in the next major release?