rtbs-dev / text-data-spr22

Class Exercises for PPOL628 Spring 2022
MIT License

hw_2_pipeline. Analysis still needs a bunch of work but dvc workflow … #22

Open yousufabdelfatah opened 2 years ago

yousufabdelfatah commented 2 years ago

…working pretty well

rtbs-dev commented 2 years ago

Good workflow. Generally I would advise against storing big "state" variables like `X` in a Python file that you then import. Importing from that file re-runs the entire script, and it ties your functions to whatever artifacts that code happens to produce. Use `ConfusionMatrixDisplay.from_estimator` with your saved model (e.g. `my_saved_model`) instead, which keeps you from needing to re-import `X`, `y`, etc. all the time. All you will need is the original dataframe!

Also, please remove all the changes to files outside of your folder (including the env.yml changes), since those might conflict with other students' work.