google / sling

SLING - A natural language frame semantics parser
Apache License 2.0

Modify internal architecture of the LSTMs and FF unit #214

Closed: satya-323 closed this issue 6 years ago

satya-323 commented 6 years ago

Hi, I have been training the model on the GENIA corpus to extract entities such as proteins and genes from medical articles. I wanted to modify the internal architecture of the neural networks in the model to see whether it would result in an increase in the F1 score. It would be of great help if you could point me to where this architecture is defined.

johndpope commented 6 years ago

(Presumably you've seen http://neuroner.com/, which does named-entity recognition?)

Its F1 score is world class (https://github.com/Franck-Dernoncourt/NeuroNER/blob/master/trained_models/performances.md); you just need good training datasets.

rahul1980 commented 6 years ago

Assuming you are working with the master branch, the LSTM cell is defined here: https://github.com/google/sling/blob/b6834cc9d6ab3a04051dad8f9e819b98028280d3/third_party/syntaxnet/dragnn/python/network_units.py#L987

and the FF unit is defined here: https://github.com/google/sling/blob/b6834cc9d6ab3a04051dad8f9e819b98028280d3/third_party/syntaxnet/dragnn/python/network_units.py#L781
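
For orientation, here is a minimal, framework-free sketch of the standard LSTM cell equations, just to show which pieces an architecture change typically touches (gate activations, hidden size, extra transformations). This is not SLING's code; the actual TensorFlow implementation is the network_units.py code linked above.

```python
# Standard LSTM cell math in numpy, as a map of what you would be editing.
# This is NOT SLING's implementation; the real cell is the TF code linked above.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: [input_dim + hidden, 4 * hidden], b: [4 * hidden]."""
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, g, o = np.split(z, 4)                        # input, forget, cell, output
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state
    h = sigmoid(o) * np.tanh(c)                        # new hidden state
    return h, c

# The feed-forward unit is conceptually simpler: one or more relu layers over
# its input features, e.g. hidden = np.maximum(x @ W1 + b1, 0.0), followed by
# a linear layer producing logits over parser actions.
```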

The spec that puts the architecture together is generated here: https://github.com/google/sling/blob/4c77371612ab42c8b3ecaeb4cf6598b5000437d8/sling/nlp/parser/trainer/generate-master-spec.cc#L330 for the left-to-right LSTM (with the right-to-left LSTM a few lines below), and here: https://github.com/google/sling/blob/4c77371612ab42c8b3ecaeb4cf6598b5000437d8/sling/nlp/parser/trainer/generate-master-spec.cc#L365 for the feed-forward unit.
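
To make the wiring concrete, here is a small numpy sketch of the shape that spec describes: a left-to-right pass and a right-to-left pass over the same tokens, with a feed-forward unit consuming both. It is deliberately simplified (a plain tanh update stands in for the LSTM cell, and the real FF unit is a transition-based decoder that pulls linked features from the LSTM states rather than classifying each token directly); the actual spec is a DRAGNN MasterSpec assembled by the C++ code linked above.

```python
# Illustrative wiring only: two directional recurrent passes feeding a
# feed-forward unit. Dimensions and the tanh update are placeholders.
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hidden, ff_hidden, num_actions = 16, 32, 64, 10
tokens = rng.normal(size=(7, emb_dim))                # 7 token embeddings

def rnn_pass(inputs, W, b):
    """Simple recurrent pass (stand-in for the LSTM network unit)."""
    h, states = np.zeros(hidden), []
    for x in inputs:
        h = np.tanh(np.concatenate([x, h]) @ W + b)
        states.append(h)
    return np.stack(states)

W_lr = rng.normal(size=(emb_dim + hidden, hidden)) * 0.1
W_rl = rng.normal(size=(emb_dim + hidden, hidden)) * 0.1
b = np.zeros(hidden)

lr_states = rnn_pass(tokens, W_lr, b)                 # left-to-right LSTM
rl_states = rnn_pass(tokens[::-1], W_rl, b)[::-1]     # right-to-left LSTM

# Feed-forward unit: consumes both directions' states for each position.
ff_in = np.concatenate([lr_states, rl_states], axis=1)
W1 = rng.normal(size=(2 * hidden, ff_hidden)) * 0.1
W2 = rng.normal(size=(ff_hidden, num_actions)) * 0.1
logits = np.maximum(ff_in @ W1, 0.0) @ W2             # relu layer + action logits
print(logits.shape)                                   # (7, num_actions)
```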

Training with a modified architecture shouldn't be a problem, and the tf-parse annotation tool should also work. However, the faster Myelin-based parser tool may not necessarily work, depending on the modification.

rahul1980 commented 6 years ago

Besides modifying the architecture, please also consider hyperparameter tuning. The current hyperparameter defaults are the ones that worked well for the model we released. You may want to play around with the learning rate, the Adam optimizer hyperparameters, dropout, etc. The full list is here: https://github.com/google/sling#specify-training-options-and-hyperparameters
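
If it helps, a minimal grid-search sketch for such a sweep is below. `train_and_evaluate` is a hypothetical placeholder for however you launch training and read back the dev-set F1; the actual option names are the ones listed in the README link above.

```python
# Minimal hyperparameter sweep sketch. `train_and_evaluate` is a hypothetical
# stub: wire it to the actual training script and have it return dev F1.
import itertools

grid = {
    "learning_rate": [5e-4, 1e-3, 2e-3],
    "dropout": [0.0, 0.2, 0.3],
    "adam_beta2": [0.98, 0.999],
}

def train_and_evaluate(params):
    # Placeholder: launch a run with `params`, parse its eval output, return F1.
    return 0.0

best_f1, best_params = -1.0, None
keys = sorted(grid)
for values in itertools.product(*(grid[k] for k in keys)):
    params = dict(zip(keys, values))
    f1 = train_and_evaluate(params)
    if f1 > best_f1:
        best_f1, best_params = f1, params

print("best dev F1:", best_f1, "with", best_params)
```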

satya-323 commented 6 years ago

Thanks a lot for the suggestions.