sfu-db / dataprep

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
http://dataprep.ai
MIT License
1.99k stars, 203 forks

Add selections in clean column and modify the UI #826

Closed yixuy closed 2 years ago

yixuy commented 2 years ago

Description

This PR adds a new feature to the data cleaning frontend that is triggered when the user clicks a column selection, and it modifies the UI design.

How Has This Been Tested?

import numpy as np
import pandas as pd

# Sample data with a duplicated row, missing values, literal "NULL" strings,
# and inconsistently styled column names.
df = pd.DataFrame({
    "Name": ["Abby", "Scott", "Scott", "Scott2", np.nan, "NULL"],
    "AGE": [12, 33, 33, 56, np.nan, "NULL"],
    "weight__": [32.5, 47.1, 47.1, 55.2, np.nan, "NULL"],
    "Admission Date": ["2020-01-01", "2020-01-15", "2020-01-15",
                       "2020-09-01", pd.NaT, "NULL"],
    "email_address": ["abby@gmail.com", "scott@gmail.com", "scott@gmail.com",
                      "test@abc.com", np.nan, "NULL"],
    "Country of Birth": ["CA", "Canada", "Canada", "NULL", np.nan, "NULL"],
    "Contact (Numbers)": ["1-789-456-0123", "1-123-456-7890", "1-123-456-7890",
                          "1-456-123-7890", np.nan, "NULL"],
})
df

# Launch the cleaning GUI on the sample frame.
from dataprep.clean import clean_df_gui
clean_df_gui(df)

Run the test code above in a Jupyter notebook.
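For reviewers who want to see what the sample frame exercises before opening the GUI, the following is a minimal pandas-only sketch. It is not part of the PR and does not reflect how clean_df_gui works internally. The reduced three-column frame and the names `normalized`, `n_empty_rows`, and `n_duplicates` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Reduced copy of the test frame (three of its seven columns).
df = pd.DataFrame({
    "Name": ["Abby", "Scott", "Scott", "Scott2", np.nan, "NULL"],
    "AGE": [12, 33, 33, 56, np.nan, "NULL"],
    "weight__": [32.5, 47.1, 47.1, 55.2, np.nan, "NULL"],
})

# Treat the literal string "NULL" as a missing value.
normalized = df.mask(df.eq("NULL"))

# Rows that are missing in every column after normalization.
n_empty_rows = int(normalized.isna().all(axis=1).sum())

# Exact duplicate rows among the rows that still carry data.
n_duplicates = int(normalized.dropna(how="all").duplicated().sum())

print(f"empty rows: {n_empty_rows}, duplicate rows: {n_duplicates}")
```

The sample data therefore contains one repeated record (the second "Scott" row) and two rows with no usable values, which gives the GUI's column operations something visible to act on.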

Snapshots:

image

codecov[bot] commented 2 years ago

Codecov Report

Merging #826 (f025b43) into develop (482ca40) will decrease coverage by 0.66%. The diff coverage is 7.34%.

Note: Current head f025b43 differs from the pull request's most recent head e2f78fc. Consider uploading reports for commit e2f78fc to get more accurate results.

@@             Coverage Diff             @@
##           develop     #826      +/-   ##
===========================================
- Coverage    55.32%   54.65%   -0.67%     
===========================================
  Files          293      293              
  Lines        18993    19239     +246     
===========================================
+ Hits         10507    10515       +8     
- Misses        8486     8724     +238     
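The percentages in the coverage diff above follow from the Hits and Lines totals. As a rough check, assuming coverage is simply hits divided by tracked lines, the arithmetic can be sketched as follows (the helper `coverage_pct` is a hypothetical name, not a Codecov API):

```python
# Totals copied from the Codecov coverage diff.
base_hits, base_lines = 10507, 18993   # develop
head_hits, head_lines = 10515, 19239   # #826


def coverage_pct(hits: int, lines: int) -> float:
    """Line coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


base = coverage_pct(base_hits, base_lines)
head = coverage_pct(head_hits, head_lines)

print(f"develop: {base:.2f}%")         # develop: 55.32%
print(f"#826:    {head:.2f}%")         # #826:    54.65%
print(f"change:  {head - base:+.2f}%") # change:  -0.67%
```

The drop comes from the PR adding 246 tracked lines while only 8 of them are hit by tests, so the denominator grows much faster than the numerator.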
Impacted Files                                     Coverage Δ
dataprep/clean/clean_url.py                        97.84% <ø> (ø)
dataprep/clean/gui/clean_gui.py                    11.17% <0.00%> (-9.54%)
dataprep/eda/create_diff_report/__init__.py        50.00% <0.00%> (ø)
dataprep/eda/create_diff_report/diff_formatter.py  16.58% <0.00%> (ø)
dataprep/eda/correlation/render.py                 98.69% <100.00%> (+0.02%)
dataprep/eda/distribution/render.py                90.73% <100.00%> (ø)
dataprep/tests/eda/test_create_report.py           87.87% <100.00%> (+0.78%)
dataprep/tests/eda/test_plot.py                    100.00% <100.00%> (ø)
dataprep/eda/correlation/compute/overview.py       98.49% <0.00%> (-0.76%)
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update ba961a1...e2f78fc.