Closed dirkweissenborn closed 6 years ago
@riedelcastro @TimDettmers could you take a look at this?
@pminervini how can I add pytorch as a requirement such that it plays well together with setuptools? For pytorch we only get a link to a whl at the moment here. I tried adding the following to requirements.txt (which works fine with `pip install -r`):
http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl ; sys_platform == "linux"
http://download.pytorch.org/whl/torch-0.2.0.post3-cp35-cp35m-macosx_10_7_x86_64.whl ; sys_platform == "darwin"
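For reference, a possible alternative is a PEP 508 direct URL reference inside `install_requires`, which combines the wheel URL with the same environment markers; whether the pip/setuptools versions in use support it would need checking (sketch only, not tested against this project's setup.py):

```
# setup.py fragment (sketch; requires pip/setuptools with PEP 508
# direct-reference support)
install_requires=[
    'torch @ http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl ; sys_platform == "linux"',
    'torch @ http://download.pytorch.org/whl/torch-0.2.0.post3-cp35-cp35m-macosx_10_7_x86_64.whl ; sys_platform == "darwin"',
]
```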
Merging #301 into master will decrease coverage by 7.45%. The diff coverage is 24.19%.
@@ Coverage Diff @@
## master #301 +/- ##
==========================================
- Coverage 60.77% 53.32% -7.46%
==========================================
Files 75 85 +10
Lines 4352 4923 +571
==========================================
- Hits 2645 2625 -20
- Misses 1707 2298 +591
Impacted Files | Coverage Δ |
---|---|
jack/core/model_module.py | 75.75% <ø> (+0.34%) :arrow_up: |
jack/torch_util/segment.py | 0% <0%> (ø) |
jack/torch_util/highway.py | 0% <0%> (ø) |
jack/core/torch.py | 0% <0%> (ø) |
jack/torch_util/xqa.py | 0% <0%> (ø) |
jack/readers/extractive_qa/torch/fastqa.py | 0% <0%> (ø) |
jack/torch_util/embedding.py | 0% <0%> (ø) |
jack/torch_util/rnn.py | 0% <0%> (ø) |
jack/torch_util/misc.py | 0% <0%> (ø) |
jack/readers/extractive_qa/tensorflow/fastqa.py | 91.17% <100%> (ø) |
... and 26 more |
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 28c1d13...0f03066. Read the comment docs.
I skimmed the code and it looks okay. It would be nice if you worked more with the template method pattern, which seems made for the job: implement abstract classes that implement the algorithm and defer TensorFlow/PyTorch specifics to their subclasses. This would mean having no framework-specific ModelModule etc. but only one general one. If we add another model we would also only need to implement it once instead of twice for TensorFlow and PyTorch, which would be a big bonus.
This would be much more work, but once it stands you never have to change the abstract classes, and it is not necessary to implement all the PyTorch code up front. This way you have stubs like conv_char_embed(...) or any other function which people can implement if they want to have their favorite model in PyTorch. None of us would need to learn the PyTorch abstractions; one would just look up which stub throws the NotImplementedError and implement that stub.
This would also make it trivial to extend the framework to more than two backends, since you do not have to duplicate code, ModelModules, etc.
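The template-method idea above can be sketched as follows. The class and stub names (AbstractModelModule, conv_char_embed) follow the discussion; the toy backend and its string operations are purely hypothetical, standing in for real TensorFlow or PyTorch ops:

```python
# Sketch of the template method pattern: the abstract class fixes the
# algorithm once, framework-specific subclasses fill in the stubs.
from abc import ABC, abstractmethod


class AbstractModelModule(ABC):
    """Defines the model-building algorithm; backends implement the stubs."""

    def build(self, chars, words):
        # Template method: the overall wiring is fixed here and never changes.
        embedded = self.conv_char_embed(chars)
        return self.encode(embedded, words)

    @abstractmethod
    def conv_char_embed(self, chars):
        """Stub: char embedding, one implementation per framework."""
        raise NotImplementedError

    @abstractmethod
    def encode(self, embedded, words):
        """Stub: sequence encoding, one implementation per framework."""
        raise NotImplementedError


class EchoModelModule(AbstractModelModule):
    """Toy backend showing only the mechanics; a real subclass would call
    TensorFlow or PyTorch ops here."""

    def conv_char_embed(self, chars):
        return [c.upper() for c in chars]

    def encode(self, embedded, words):
        return embedded + words
```

A backend that has not implemented a stub raises NotImplementedError at that exact stub, which is the "look up which stub throws and implement it" workflow described above.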
On the other hand, we want to push this to the public soon and the added time might prevent this from having pytorch support at all if we choose the template pattern approach. However, if we do not do this now, it will be almost impossible to refactor and might hinder progress in this area in the future.
@TimDettmers Thanks for the input!
Do I understand correctly that you want to implement our models independently of the framework? If yes, I think this is not a good idea, because we would have to generalize all functionality of TF and PyTorch in jack, right? This is actually really problematic, because they are not easy to align with each other in some cases. It is also hard to generalize the two workflows more than we do at the moment in the ModelModules and JTReaders, because TF is static and PyTorch is dynamic.
I think having the option to use either PyTorch or TF to implement your model is the big plus, not being able to run your model in either one. Right now we have two implementations of FastQA as an example, but normally we should only have one implementation of a model in a single one of the supported frameworks, because that is enough to use Jack for training/evaluation/application of that reader.
Indeed it would be an issue, because you have to unify many operations. However, I do not think we would need to do this for every operation. Things like LSTMs, char embeddings, and self-attention are well-defined building blocks. For attention and biattention there are more variants, but if implemented at this level it might still be possible. Overall, though, I agree this could become a mess.
One issue might also be how to access the input variables with this approach, since you would have to unify PyTorch and TensorFlow variables.
Thinking about it again, this approach feels more like reimplementing Keras than like an easy model for our framework. But maybe in some places it still makes sense.
@pminervini how can I add pytorch as a requirement such that it plays well together with setuptools? For pytorch we only get a link to whl at the moment here.
@dirkweissenborn, what about adding torch to requirements.txt?
https://github.com/pytorch/examples/blob/master/imagenet/requirements.txt
@pminervini there is no support for this yet.
254
pytorch integration. Main changes made:
- PyTorchModelModule and PyTorchReader
- train_reader.py supports both train_tensorflow and train_torch
- jack.core
- jack.readers.extractive_qa.fastqa_torch
TODO:
Example
The following training works:
It reuses the same input and output module as fastqa_reader and is about 1.5x as fast as the TF implementation.