Lightning-Universe / lightning-transformers

Flexible components pairing 🤗 Transformers with ⚡ PyTorch Lightning
https://lightning-transformers.readthedocs.io
Apache License 2.0

Clarification of Documentation for custom tasks #169

Closed vikigenius closed 3 years ago

vikigenius commented 3 years ago

I was going through the documentation for custom tasks:

It says:

"Typically you’d store the file within the lightning_transformers/task/ directory"

Where is this directory located? Are we supposed to fork the lightning-transformers repository?

Is it possible to just replicate this directory structure in my local project and then use the cli?

Is it possible to create an entirely new custom task? For example, if I want to use a transformer model to perform a regression task, can I just create a folder for regression and follow the example?

Additionally, is it possible to save the trained model to TorchScript/ONNX?

SeanNaren commented 3 years ago

Thanks for your issue!

Typically you'd clone this project locally, and then you can use the CLI or the scripts. That way, you can put your code into the appropriate task directory: https://github.com/PyTorchLightning/lightning-transformers/tree/master/lightning_transformers/task/nlp

Could you go into more detail as to what your regression task is?

I haven't confirmed this, but the LightningModule does support these: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html#to-torchscript

This should mean that, after training, you can call model.to_torchscript().

EDIT: I'll try to clear this up in the documentation and make it more explicit!
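
For illustration, here is a rough sketch of the export path mentioned above. LightningModule subclasses torch.nn.Module, and as far as I know its to_torchscript() method scripts the module by default, so the sketch uses plain torch.jit.script on a small hypothetical module. TinyRegressor and its layer sizes are made up for this example and are not part of lightning-transformers. This shows roughly what the call does under those assumptions; it has not been checked against a trained task from this repo.

```python
import os
import tempfile

import torch
from torch import nn


class TinyRegressor(nn.Module):
    """Hypothetical stand-in for a trained model (not from lightning-transformers)."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # One scalar prediction per input row.
        return self.net(x).squeeze(-1)


torch.manual_seed(0)
model = TinyRegressor().eval()

# Compile the module to TorchScript. With a LightningModule, the equivalent
# call would be model.to_torchscript().
scripted = torch.jit.script(model)

example = torch.randn(2, 8)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")
    torch.jit.save(scripted, path)   # write the scripted model to disk
    reloaded = torch.jit.load(path)  # load it back without the Python class
    with torch.no_grad():
        same = torch.allclose(model(example), reloaded(example))

print("eager and reloaded outputs match:", same)
```

If the model uses control flow or types that TorchScript cannot script, tracing with example inputs is the usual alternative.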

vikigenius commented 3 years ago

@SeanNaren, take a look at the Kaggle CommonLit Readability challenge: https://www.kaggle.com/c/commonlitreadabilityprize/data?select=train.csv

I just want to try a simple baseline: fine-tune a pretrained transformer model to produce embeddings, then attach a feedforward layer at the end to predict the readability value. This is very similar to classification, except that you don't predict classes, just a single value per input.
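
A minimal sketch of that baseline in plain PyTorch, to make the idea concrete. Everything named here is an assumption for illustration: TinyEncoder is a hypothetical stand-in for a pretrained transformer (in practice you would load one, for example with 🤗 Transformers), and RegressionModel, the mean pooling, and the sizes are not part of lightning-transformers.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained transformer encoder."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16) -> None:
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Returns per-token embeddings of shape (batch, seq_len, hidden_size).
        return self.embed(input_ids)


class RegressionModel(nn.Module):
    """Encoder plus a feedforward head that predicts one value per input."""

    def __init__(self, encoder: nn.Module, hidden_size: int) -> None:
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),  # single output instead of class logits
        )

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        hidden = self.encoder(input_ids, attention_mask)
        # Mean-pool token embeddings, ignoring padded positions.
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.head(pooled).squeeze(-1)


torch.manual_seed(0)
model = RegressionModel(TinyEncoder(), hidden_size=16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

# Fake batch: 4 sequences of 8 token ids with made-up readability targets.
input_ids = torch.randint(0, 100, (4, 8))
attention_mask = torch.ones(4, 8, dtype=torch.long)
targets = torch.randn(4)

# One training step: mean squared error replaces the cross-entropy loss
# that a classification task would use.
preds = model(input_ids, attention_mask)
loss = nn.functional.mse_loss(preds, targets)
loss.backward()
optimizer.step()
```

The only differences from a classification setup are the output width of the head (1 instead of the number of classes) and the loss function.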

vikigenius commented 3 years ago

@SeanNaren Also, your answer makes sense when you clone the project, but what happens when you pip install the project? Where do we create the new task then?

SeanNaren commented 3 years ago

It's a very good point @vikigenius. We're still iterating to make this better, but the ideal situation for modifications is when you clone the project. We should make this clearer! The configs are stored in the PyPI package, but this is being refactored slightly to be cleaner.

I've opened issue #174 to try to cover these use cases. I think this repo needs to do a better job of explaining the workflow for different use cases!

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.