HealthCatalyst / healthcareai-py

Python tools for healthcare machine learning
http://healthcare.ai
MIT License

Add Pandas-Profiling data profiler #384

Open yvanhuele opened 7 years ago

yvanhuele commented 7 years ago

Include pandas-profiling

Desired User Experience (Open to Feedback)

  1. A user (who presumably has data in a dataframe) creates a trainer: trainer = SupervisedModelTrainer(...all the args)
  2. The user can then call the method trainer.data_profile_report() (or a better, more logical name).
  3. This method calls pandas-profiling and creates the report. By default, it should save the HTML report under a name containing an ISO 8601-style timestamp, for example profile_report_2017-12-10T05-33-53.html
  4. This method should print to the console the name of the saved file and its full path.
  5. If the user specifies a filename= argument, the method should save the report under that name instead.
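
To make the proposal concrete, here is a minimal sketch of the save-and-print behavior described in the steps above. It is written as a standalone function rather than a method on SupervisedModelTrainer, and the helper names (default_report_filename, render_with_pandas_profiling, the render= parameter) are hypothetical and chosen only for illustration. The renderer assumes that pandas-profiling exposes ProfileReport(df).to_html(); check that against the installed version.

```python
import os
from datetime import datetime


def default_report_filename(now=None):
    """Build a name like profile_report_2017-12-10T05-33-53.html."""
    now = now or datetime.now()
    # Hyphens replace the colons of ISO 8601 times, since colons are
    # not allowed in Windows filenames.
    return "profile_report_{}.html".format(now.strftime("%Y-%m-%dT%H-%M-%S"))


def render_with_pandas_profiling(dataframe):
    """Assumed renderer: needs the pandas-profiling package installed."""
    # Imported lazily so the rest of the sketch works without the package.
    import pandas_profiling
    return pandas_profiling.ProfileReport(dataframe).to_html()


def data_profile_report(dataframe, filename=None,
                        render=render_with_pandas_profiling):
    """Save an HTML profile of `dataframe` and print where it was written."""
    # Step 5: an explicit filename wins; step 3: otherwise use a timestamp.
    filename = filename or default_report_filename()
    full_path = os.path.abspath(filename)

    html = render(dataframe)
    with open(full_path, "w", encoding="utf-8") as report_file:
        report_file.write(html)

    # Step 4: report both the file name and the full path.
    print("Saved data profile report: {}".format(os.path.basename(full_path)))
    print("Full path: {}".format(full_path))
    return full_path
```

Taking the renderer as a parameter keeps the file-naming and printing logic testable without pandas-profiling installed. A trainer method could simply call data_profile_report(self.dataframe, filename=filename), where the dataframe attribute name is likewise an assumption.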

Other Notes

Aylr commented 6 years ago

Is there code-sharing potential with #420?