Closed diego-lopez8 closed 8 months ago
Please also clean up the comments, as the TODOs are mostly completed.
@diego-lopez8 Can I remove the code (from last semester) that we're not using?
Yes!
@diego-lopez8 Done. Please check the latest PR.
There are a lot of optimizations we can make to the preprocess_json() function. Because we are already selecting on the fields in the makedf_samecol() function, there is less of a need to spend cycles actually dropping columns from the dataframe. We should try to clean up as many of these drops as possible.

The end goal should be changing everything to numpy. That is not this ticket; let's just make pandas run as fast as we can before spending enormous energy changing it to numpy.
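
For reference, a rough sketch of the idea, not the actual code: the field list, the sample records, and both helper names below are made up for illustration, and the real preprocess_json() / makedf_samecol() may look quite different. It only shows why building the dataframe with the wanted columns up front makes the later drops unnecessary.

```python
import pandas as pd

# Hypothetical field list; the real selection lives in makedf_samecol().
WANTED_FIELDS = ["id", "name", "score"]


def build_then_drop(records):
    """Pattern to avoid: materialize every column, then drop the extras."""
    df = pd.DataFrame(records)
    unwanted = [col for col in df.columns if col not in WANTED_FIELDS]
    return df.drop(columns=unwanted)


def select_up_front(records):
    """Leaner pattern: only build the columns we intend to keep."""
    return pd.DataFrame(records, columns=WANTED_FIELDS)


# Made-up sample input standing in for the parsed JSON.
records = [
    {"id": 1, "name": "a", "score": 0.5, "debug": "x", "raw": "..."},
    {"id": 2, "name": "b", "score": 0.7, "debug": "y", "raw": "..."},
]

slow = build_then_drop(records)
fast = select_up_front(records)
```

Both paths should end with the same columns and values, so any drop that only removes fields already excluded by the selection is a candidate for deletion. It would be worth timing both on real data before and after the cleanup to confirm the gain.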