gobbedy / thesis-scratch

Any/all work related to thesis -- a bit of a scratch space

Rewrite reading course project code in python #7

Open · gobbedy opened this issue 6 years ago

gobbedy commented 6 years ago

TODO:

1. convex library -- DONE
   - keep an eye on this issue: https://github.com/cvxgrp/cvxpy/issues/499; should get an automated e-mail notification, but check just in case
   - switch to cvxpy 1.0 after getting a reply to the above issue
2. make python standalone; ensure same result as Julia -- DONE
3. classes for smoothers/weighters, class for main functions -- DONE
4. print results to file, clean up, add nice comments, add printing at start and end of functions -- HANDLED IN NEXT COMMENT
   - script for automatic git syncing between Linux and Compute Canada (?); should also load the python3.6 module
5. set up GitHub and try cloning to Compute Canada via github.com -- DONE
6. compare performance of Julia vs. Python; use Compute Canada for this -- HANDLED IN NEXT COMMENT

gobbedy commented 6 years ago

Remaining TODO:

  1. cleanup code + prep for acceleration
     - sync what I did yesterday on Compute Canada to GitHub and vbox -- DONE
     - reduce line size to make GitHub look nice
     - switch from numpy to tensor in the entire code
     - try changing code to L2 for speedup
     - print results to file
     - cleanup, esp. gen_data and value_at_risk
     - add nice comments to clarify what each function does (including list of args, etc., as per the style of the weighter class) + create smaller functions as needed
     - check what smoother/weighter was used in the first paper
     - create a k-nearest-neighbor function; use my FAISS question to rename stuff
     - add logging and printing at the start and end of functions; make sure self.name is included -- DONE
     - put start and end function prints in a decorator
     - add a timing decorator as well
     - add decoration to the other classes as needed
     - add profiling
     - better name for "nearest_neighbors_learner"
     - bashrc should load the python3.6 module -- DONE
     - get nedit working -- DONE
     - create a "golden model" whose output is known for a known input; can then check performance enhancements against this model to ensure no bugs are introduced (note: will have to add a switch to disable stochastic parts, i.e. anything with random or sample)
     - then label it (figure out how to do this in git)
     - get profiling to run conditionally (e.g. only when a profiling argument is passed in, or only in debug mode)
     - create a Mahalanobis distance function for clarity
     - benchmark vs. Julia?
     - FAISS for Mahalanobis distance search?
  2. accelerate code performance
     - explore which C compiler (or other speedup technique) is best -- winner: pytorch
     - add Python C-compilation as needed
     - explore running on multiple cores
     - explore running on GPU if possible: https://docs.computecanada.ca/wiki/PyTorch
     - look into the parallel techniques described here: https://docs.computecanada.ca/wiki/Job_scheduling_policies#Whole_nodes_versus_cores
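The combined start/end-print and timing decorator from item 1 might look something like the sketch below. The name `log_calls` is mine, not from the repo, and the exact message format in the thesis code may differ:

```python
import functools
import time

def log_calls(func):
    """Print a line at the start and end of each call, with elapsed time.

    Sketch only; the thesis code may use logging instead of print, and may
    also report self.name for methods.
    """
    @functools.wraps(func)  # preserve __name__/__doc__ of the wrapped function
    def wrapper(*args, **kwargs):
        print(f"START {func.__name__}")
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"END   {func.__name__} ({elapsed:.3f}s)")
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(1, 2)  # prints the START/END lines and returns 3
```

Because the wrapper runs the END print in a `finally` block, the end-of-function line is emitted even when the wrapped function raises, which keeps the logs balanced.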
gobbedy commented 6 years ago

The rewrite to Python is done. What remains is performance enhancement; creating a new issue for that.