gobbedy / thesis-scratch

Any/all work related to thesis -- a bit of a scratch space

Rewrite reading course project code in python #7

Open · gobbedy opened this issue 6 years ago

gobbedy commented 6 years ago

TODO:

1. convex library -- DONE
   - keep an eye on this issue: https://github.com/cvxgrp/cvxpy/issues/499; should get an automated e-mail notification, but check just in case
   - switch to cvxpy 1.0 after getting a reply to the above issue
2. make python standalone; ensure same result as Julia -- DONE
3. classes for smoothers/weighters, class for main functions -- DONE
4. print results to file, clean up, add nice comments, add printing at start and end of functions -- HANDLED IN NEXT COMMENT
   - script for automatic git syncing between Linux and Compute Canada (?); should also load the python3.6 module
5. set up GitHub and try cloning to Compute Canada via github.com -- DONE
6. compare performance of Julia vs. Python; use Compute Canada for this -- HANDLED IN NEXT COMMENT

gobbedy commented 6 years ago

Remaining TODO:

  1. cleanup code + prep for acceleration
     - sync what I did yesterday on Compute Canada to GitHub and vbox -- DONE
     - reduce line size to make GitHub look nice
     - switch from numpy to tensor in the entire code
     - try changing code to L2 for speedup
     - print results to file
     - cleanup, esp. gen_data and value_at_risk
     - add nice comments to clarify what each function does (including list of args, etc., as per the style of the weighter class) + create smaller functions as needed
     - check what smoother/weighter was used in the first paper
     - create a k-nearest-neighbor function; use my FAISS question to rename stuff
     - add logging and printing at the start and end of functions; make sure self.name is included -- DONE
     - put start and end function prints in a decorator
     - add a timing decorator as well
     - add decoration to the other classes as needed
     - add profiling
     - better name for "nearest_neighbors_learner"
     - bashrc should load the python3.6 module -- DONE
     - get nedit working -- DONE
     - create a "golden model" whose output is known for a known input; can then check performance enhancements against this model to ensure no bugs are introduced (note: will have to add a switch to disable stochastic parts, i.e. anything with random or sample)
     - then label it (figure out how to do this in git)
     - get profiling to run conditionally (e.g. only when a profiling argument is passed in, or only in debug mode)
     - create a Mahalanobis distance function for clarity
     - benchmark vs. Julia?
     - FAISS for Mahalanobis distance search?
  2. accelerate code performance
     - explore which C compiler (or other speedup technique) is best -- winner: pytorch
     - add Python C-compilation as needed
     - explore running on multiple cores
     - explore running on GPU if possible: https://docs.computecanada.ca/wiki/PyTorch
     - look into the parallel techniques described here: https://docs.computecanada.ca/wiki/Job_scheduling_policies#Whole_nodes_versus_cores
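The combined start/end-print and timing decorator from item 1 might look something like the sketch below. The name `log_calls` is mine, not from the repo, and the exact message format in the thesis code may differ:

```python
import functools
import time

def log_calls(func):
    """Print a line at the start and end of each call, with elapsed time.

    Sketch only; the thesis code may use logging instead of print, and may
    also report self.name for methods.
    """
    @functools.wraps(func)  # preserve __name__/__doc__ of the wrapped function
    def wrapper(*args, **kwargs):
        print(f"START {func.__name__}")
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"END   {func.__name__} ({elapsed:.3f}s)")
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(1, 2)  # prints the START/END lines and returns 3
```

Because the wrapper runs the END print in a `finally` block, the end-of-function line is emitted even when the wrapped function raises, which keeps the logs balanced.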
gobbedy commented 6 years ago

The rewrite to Python is done. What remains is performance enhancement; creating a new issue for that.