Implement clean_json() to parse and clean JSON files
Design-level Explanation
[x] Investigate approaches for matching and validating JSON files.
[x] Discuss output formats to be supported.
Design-level Explanation Actions
from typing import Union

import dask.dataframe as dd
import pandas as pd


def clean_json(
    df: Union[pd.DataFrame, dd.DataFrame],
    col: str,
    fix_missing: str = "empty",
    split: bool = False,
    inplace: bool = False,
    report: bool = True,
    errors: str = "coerce",
) -> pd.DataFrame:
    """
    Clean the JSON strings in a DataFrame column.

    Parameters
    ----------
    df
        Pandas or Dask DataFrame.
    col
        Name of the column containing JSON.
    split
        If True, split the JSON column into separate columns,
        one per individual component.
    inplace
        If True, delete the given column with dirty data. Else, create a new
        column with cleaned data.
    report
        If True, output the summary report. Else, output no report.
    errors
        {'ignore', 'raise', 'coerce'}, default 'coerce'.
        * If 'raise', then invalid parsing will raise an exception.
        * If 'coerce', then invalid parsing will be set as NaN.
        * If 'ignore', then invalid parsing will return the input.
    """
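The three `errors` policies described in the docstring could be sketched per cell roughly as follows. This is a minimal illustration rather than the proposed implementation: `parse_json_cell` is a hypothetical helper name, it uses only the standard `json` module, and it represents NaN with `math.nan`.

```python
import json
import math
from typing import Any


def parse_json_cell(value: Any, errors: str = "coerce") -> Any:
    """Hypothetical helper: parse one cell according to the `errors` policy."""
    if errors not in {"ignore", "raise", "coerce"}:
        raise ValueError(f"unknown errors policy: {errors!r}")
    try:
        return json.loads(value)
    except (TypeError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError; TypeError covers
        # non-string input such as None or a float NaN.
        if errors == "raise":
            raise ValueError(f"unable to parse value {value!r}") from exc
        if errors == "ignore":
            return value  # hand the input back unchanged
        return math.nan  # "coerce": invalid input becomes NaN


parsed = parse_json_cell('{"name": "Ann", "age": 30}')
coerced = parse_json_cell("{not json}")
ignored = parse_json_cell("{not json}", errors="ignore")
```

Under these assumptions, applying such a helper to every value of `col` would give the column-level behaviour; how `fix_missing`, `split`, and `report` interact with it is left open here.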
Design-level Explanation
def validate_json(x: Union[str, pd.Series]) -> Union[bool, pd.Series]:
    """
    Validate JSON format.

    Parameters
    ----------
    x
        String or Pandas Series of JSON to be validated.
    """
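One possible way to fill in `validate_json` is to attempt a parse and report whether it succeeded. The sketch below assumes that "valid" means "accepted by `json.loads`" (so bare scalars such as `42` count as valid), and that any non-string argument exposing an `apply` method is a pandas Series; `_is_valid_json` is a hypothetical helper name.

```python
import json
from typing import Any


def _is_valid_json(value: Any) -> bool:
    """Hypothetical helper: True if `value` is a string that parses as JSON."""
    if not isinstance(value, str):
        return False
    try:
        json.loads(value)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return False
    return True


def validate_json(x: Any) -> Any:
    """Sketch: validate a single string, or each element of a Series."""
    # Assumption: anything that is not a string but has `.apply`
    # is treated as a pandas Series and validated element-wise.
    if not isinstance(x, str) and hasattr(x, "apply"):
        return x.apply(_is_valid_json)
    return _is_valid_json(x)


ok = validate_json('{"a": [1, 2, 3]}')
bad = validate_json("{'a': 1}")  # single quotes are not valid JSON
missing = validate_json(None)
```

Returning `False` for non-string input, instead of raising, is a design choice of this sketch and may not match the behaviour the issue ends up adopting.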
Implementation-level Explanation
Rationale and Alternatives
Prior Art
Python's built-in json library can encode and decode JSON, and its encoders and decoders preserve input and output order by default.
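The order-preserving behaviour of the standard `json` module can be checked with a short round trip. The sample document below is made up for illustration.

```python
import json

text = '{"b": 1, "a": 2, "c": 3}'

# Decoding keeps the keys in the order they appear in the input.
decoded = json.loads(text)
keys = list(decoded)

# Encoding writes the keys back in the dict's insertion order.
encoded = json.dumps(decoded)

# Order only changes when sorting is requested explicitly.
sorted_encoded = json.dumps(decoded, sort_keys=True)
```

This suggests that a cleaner built on `json` would not reorder fields unless it opts in with `sort_keys=True`.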
Future Possibilities
Implementation-level Actions
Additional Tasks
[x] This task is put into a correct pipeline (Development Backlog or In Progress).