sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
http://dataprep.ai
MIT License
2.03k stars 204 forks

Feat/add save functionality #892

Closed sahmad11 closed 2 years ago

sahmad11 commented 2 years ago

Description

We have added two optional parameters to the create_db_report function API. The save parameter is a boolean that the user passes to save the report. The save_path parameter, if specified, saves the report in a directory of the user's choice; otherwise the report is saved in the current working directory.
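
The behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: resolve_report_dir is a hypothetical helper, and it assumes that save_path is ignored unless save is true.

```python
import os
from typing import Optional


def resolve_report_dir(save: bool = False, save_path: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: return the directory a report would be saved to.

    Returns None when the caller did not ask for the report to be saved.
    Assumption: save_path has no effect unless save is True.
    """
    if not save:
        return None
    # Fall back to the current working directory when no path is given.
    return os.path.abspath(save_path) if save_path else os.getcwd()
```

With this sketch, resolve_report_dir(save=True) yields the current working directory, and resolve_report_dir(save=True, save_path="reports") yields the absolute path of a reports directory.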

How Has This Been Tested?

Used the same test scripts and the documentation examples for testing.

jinglinpeng commented 2 years ago

@khoatxp @sahmad11 Good job! One comment: I think it's better to make the API consistent with create_report. Basically, the output of create_report(df) is a Report, and a Report has the show, save, and show_browser methods. This way, a user can call create_report(df).show() to display the report inside the notebook, create_report(df).show_browser() to display the report in the browser, and create_report(df).save('xx.html') to save the report. Please take a look at the code of Report: https://github.com/sfu-db/dataprep/blob/develop/dataprep/eda/create_report/report.py
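
To make the suggested pattern concrete, here is a minimal sketch of a factory function that returns an object with show, save, and show_browser methods. SketchReport and create_sketch_report are hypothetical stand-ins written for illustration; they are not dataprep's Report class, and the real implementation in report.py may differ.

```python
import tempfile
import webbrowser
from pathlib import Path


class SketchReport:
    """Hypothetical stand-in for a report object; not dataprep's Report class."""

    def __init__(self, html: str) -> None:
        self.html = html

    def show(self) -> None:
        # A notebook would render the markup inline; this sketch just prints it.
        print(self.html)

    def save(self, path: str) -> Path:
        # Write the markup to the path chosen by the caller.
        target = Path(path)
        target.write_text(self.html, encoding="utf-8")
        return target

    def show_browser(self) -> None:
        # Write to a temporary file and open it with the default browser.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".html", delete=False, encoding="utf-8"
        ) as f:
            f.write(self.html)
        webbrowser.open(Path(f.name).as_uri())


def create_sketch_report(title: str) -> SketchReport:
    """Hypothetical factory mirroring the create_report(df) -> Report pattern."""
    return SketchReport(f"<html><body><h1>{title}</h1></body></html>")
```

The point of this design is that the caller decides what to do with the result, for example create_sketch_report("demo").save("demo.html"), instead of passing save flags into the factory.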

jinglinpeng commented 2 years ago

Hi @khoatxp, it seems the saved file is expected to be a zip file. Without knowing this, a user may call report.save('report.html') or report.save('report') and get an HTML file or a file without an extension, which cannot be opened. Given this, is it possible to just output a folder from the save function? E.g., even when the user calls report.save('report.html'), the output is still a folder named report.html.
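
A rough sketch of the folder-based save proposed above, under stated assumptions: save_as_folder is a hypothetical function, the report is modelled as a flat mapping from file names to text content, and the path is used verbatim as the folder name.

```python
from pathlib import Path
from typing import Dict


def save_as_folder(path: str, files: Dict[str, str]) -> Path:
    """Hypothetical save: always write the report as a directory at `path`.

    `files` maps flat file names to their text content (an assumption made
    for this sketch; a real report may contain nested assets).
    """
    folder = Path(path)
    # The name is used verbatim, so 'report.html' becomes a folder named report.html.
    folder.mkdir(parents=True, exist_ok=True)
    for name, content in files.items():
        (folder / name).write_text(content, encoding="utf-8")
    return folder
```

Calling save_as_folder('report.html', {...}) then produces a directory named report.html containing the report files, whatever extension the user typed.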