Lightning-Universe / lightning-transformers

Flexible components pairing 🤗 Transformers with ⚡ PyTorch Lightning
https://lightning-transformers.readthedocs.io
Apache License 2.0
610 stars · 77 forks

sparseml integration #197

Closed mathemusician closed 3 years ago

mathemusician commented 3 years ago

Closes #196

Add option to use sparseml. This is still a little rough around the edges, but I welcome the feedback! Implementation on Google Colab.

Still needs:

1) Implementation of MODELS_PATH and RECIPE_PATH through Hydra instead of environment variables
2) The "test" stage needs to work; only training and evaluation work at the moment
3) Stylistic coding changes

Will continue to add more changes after my reactions engineering exam tomorrow. Feel free to give me pointers or links to tutorials if you see any changes I need to make.

codecov[bot] commented 3 years ago

Codecov Report

Merging #197 (2788bea) into master (6c9b495) will decrease coverage by 2.94%. The diff coverage is 49.57%.


@@            Coverage Diff             @@
##           master     #197      +/-   ##
==========================================
- Coverage   90.21%   87.27%   -2.95%     
==========================================
  Files          71       73       +2     
  Lines        1543     1658     +115     
==========================================
+ Hits         1392     1447      +55     
- Misses        151      211      +60     
Flag        Coverage Δ
unittests   87.27% <49.57%> (-2.95%) ⬇️


Impacted Files                                          Coverage Δ
lightning_transformers/core/callback.py                 28.57% <25.67%> (-12.61%) ⬇️
lightning_transformers/core/loggers.py                  87.50% <87.50%> (ø)
...transformers/task/nlp/text_classification/model.py   97.61% <87.50%> (-2.39%) ⬇️
lightning_transformers/utilities/imports.py             100.00% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6c9b495...2788bea.

mathemusician commented 3 years ago

Wow, that's a lot of red. I think I may be able to get rid of the dependency on pytorch-lightning-bolts, which isn't even listed as a dependency yet. Also, @SeanNaren I've tested sparseml with torch 1.9, and it works!? I'll do a more thorough test after my exam tomorrow and add the appropriate changes.

SeanNaren commented 3 years ago

Thanks @mathemusician I'll be able to assist next week :)

mathemusician commented 3 years ago

Figured out why the "test" stage doesn't work. The labels given to BERT during the test stage are `'labels': tensor([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1])`. When BERT tries to calculate the cross-entropy loss on those, it fails immediately. To fix this, I added a check in `task/nlp/text_classification/model.py`: if the labels contain -1, they are replaced with `None`. The test stage now works splendidly.
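A minimal sketch of that guard (the function name here is hypothetical; the real change lives in `task/nlp/text_classification/model.py` and operates on torch tensors):

```python
def sanitize_labels(labels):
    """Return None when every label is the -1 placeholder produced during
    the test stage, so the model skips the cross-entropy loss entirely.

    Accepts a torch tensor or any plain sequence (duck-typed via .tolist()).
    """
    if labels is None:
        return None
    values = labels.tolist() if hasattr(labels, "tolist") else list(labels)
    if values and all(v == -1 for v in values):
        return None
    return labels
```

With `labels=None`, Hugging Face models return logits without computing a loss, which is exactly what the test stage needs.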

mathemusician commented 3 years ago

Also, I figured out how to change the recipe path and output directory after instantiation via Hydra overrides:

++trainer.callbacks.output_dir=/content/MODELS            \
++trainer.callbacks.recipe_path=/content/recipe.yaml      \

I'm getting rid of the dependency on environment variables.
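For the overrides above to work, the callback just needs to expose those values as constructor arguments so Hydra can set them from the command line. A rough sketch under that assumption (class and argument names are illustrative, not necessarily the PR's exact code):

```python
class SparseMLCallback:
    """Hypothetical sketch: exposing recipe_path/output_dir as constructor
    arguments lets Hydra set them via ++trainer.callbacks.* overrides,
    instead of reading MODELS_PATH/RECIPE_PATH environment variables."""

    def __init__(self, recipe_path: str = "recipe.yaml", output_dir: str = "models"):
        self.recipe_path = recipe_path
        self.output_dir = output_dir

# Hydra would instantiate this from the resolved config;
# shown here directly for illustration:
cb = SparseMLCallback(recipe_path="/content/recipe.yaml", output_dir="/content/MODELS")
```

The `++` prefix in Hydra forces the key to be added to the config even if the base config doesn't declare it.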

mergify[bot] commented 3 years ago

This pull request is now in conflict... :(

SeanNaren commented 3 years ago

PR looks good! Any chance you could add a quick test for one of the tasks to ensure the callback works end to end?

mathemusician commented 3 years ago

> PR looks good! Any chance you could add a quick test for one of the tasks to ensure the callback works end to end?

I'll try, it's my first time writing tests. Gotta start somewhere.

SeanNaren commented 3 years ago

> PR looks good! Any chance you could add a quick test for one of the tasks to ensure the callback works end to end?
>
> I'll try, it's my first time writing tests. Gotta start somewhere.

Let me know if you get stuck, more than happy to help out! The code is really well thought out so far, so I have no doubt you can do it :D

mathemusician commented 3 years ago

The logger and callback tests need certain imports that aren't in requirements.txt, but I wasn't sure if we wanted to add more dependencies. The logger test needs wandb, and the sparseml callback test needs torch 1.7-1.8.
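One common way to handle this (the diff already adds a `lightning_transformers/utilities/imports.py`; the helper below is a generic sketch, not necessarily that file's exact contents) is to gate optional dependencies behind an availability check and skip the tests when the package is missing:

```python
import importlib.util

def module_available(name: str) -> bool:
    """True if `name` can be imported, checked without actually importing it."""
    return importlib.util.find_spec(name) is not None

WANDB_AVAILABLE = module_available("wandb")

# In a test module, pytest can then skip cleanly when wandb is absent:
# import pytest
# pytestmark = pytest.mark.skipif(not WANDB_AVAILABLE, reason="wandb not installed")
```

This keeps wandb and sparseml out of the hard requirements while still letting CI exercise the tests on machines that have them.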

mathemusician commented 3 years ago

Should I add wandb to requirements.txt? Also, I'm going to update the docs soon with the different ways to use sparseml.

SeanNaren commented 3 years ago

Thanks for your hard work @mathemusician! Hopefully the tests pass now and we can merge :)