pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
43.82k stars 17.99k forks

BUG: if a child element contains NULL, this results in a TypeError: 'NoneType' object is not iterable #53719

Open maarten-van-wauwe opened 1 year ago

maarten-van-wauwe commented 1 year ago

Pandas version checks

Reproducible Example

import pandas as pd
import json

json_text = """
{"root": [
    {
        "id": 1,
        "field1": "apple",
        "field2": "banana",
        "siblings": [
            {
                "sibling_id": "1",
                "sibling1": "horse",
                "sibling2": "ape",
                "subsiblings": [{"a": "b"}]
            }
        ]
    },
    {
        "id": 2,
        "field1": "cola",
        "field2": "dwarf",
        "siblings": null
    }
]}
"""

json_object = json.loads(json_text)

print(pd.json_normalize(json_object, record_path=['root']))
print(pd.json_normalize(json_object, record_path=['root','siblings'], meta=[['root','id']]))
print(pd.json_normalize(json_object, record_path=['root','siblings','subsiblings'], meta=[['root','siblings','sibling_id']]))

Issue Description

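The element with id = 2 has null in its "siblings" field. When json_normalize recurses through record_path into that field, it tries to iterate over None, which raises TypeError: 'NoneType' object is not iterable.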

Proposed solution: add the following two lines at the top of the "_recursive_extract" function:

def _recursive_extract(data, path, seen_meta, level=0):
    if data is None:
        return
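
To illustrate the idea behind the guard, here is a minimal pure-Python sketch. It is not the pandas source: `collect_records` is a hypothetical, simplified stand-in for a recursive record extractor, and it ignores meta handling entirely. It only shows that returning early on a null node lets the remaining elements be processed.

```python
def collect_records(data, path):
    """Simplified stand-in for a recursive record extractor (not pandas code)."""
    # Guard in the spirit of the proposal: a null node contributes no records.
    if data is None:
        return []
    # Treat a single object like a one-element list.
    if isinstance(data, dict):
        data = [data]
    records = []
    for obj in data:
        child = obj.get(path[0])
        if len(path) > 1:
            # Descend one level along the remaining path.
            records.extend(collect_records(child, path[1:]))
        elif child is not None:
            # Last path component: child is the list of records.
            records.extend(child)
    return records


doc = {
    "root": [
        {"id": 1, "siblings": [{"sibling_id": "1", "subsiblings": [{"a": "b"}]}]},
        {"id": 2, "siblings": None},
    ]
}

result = collect_records(doc, ["root", "siblings", "subsiblings"])
print(result)
```

Without the `if data is None` check, the loop over `data` would fail on the second root element in the same way as the reported error.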

Expected Behavior

Null elements should be skipped instead of being passed on for further recursive processing.
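
Until this behavior is available, one possible workaround is to replace null values with empty lists before calling json_normalize. The sketch below assumes the keys on the record path are known in advance; `replace_null_lists` is a hypothetical helper, not part of pandas.

```python
import pandas as pd


def replace_null_lists(node, keys):
    """Return a copy of node where None under any of `keys` becomes [].

    Hypothetical helper for illustration; not a pandas function.
    """
    if isinstance(node, dict):
        return {
            k: [] if (k in keys and v is None) else replace_null_lists(v, keys)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [replace_null_lists(item, keys) for item in node]
    return node


json_object = {
    "root": [
        {
            "id": 1,
            "siblings": [
                {"sibling_id": "1", "subsiblings": [{"a": "b"}]},
            ],
        },
        {"id": 2, "siblings": None},
    ]
}

# Replace null "siblings"/"subsiblings" values so every level is iterable.
cleaned = replace_null_lists(json_object, {"siblings", "subsiblings"})

df = pd.json_normalize(
    cleaned,
    record_path=["root", "siblings", "subsiblings"],
    meta=[["root", "siblings", "sibling_id"]],
)
print(df)
```

The element with id = 2 then contributes no rows, which matches the expected behavior described above.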

Installed Versions

this failed

maarten-van-wauwe commented 1 year ago

This happened with several APIs that returned JSON to me, e.g. the Skype JSON that you can download as a backup of conversations.

punndcoder28 commented 1 year ago

Hi! I can work on this issue if it is not already being worked on.