Open NguyenDrasp opened 1 year ago
This issue comes from line 1139 of facebook_scraper/extractors.py, which loads the response text directly into JSON. Sometimes, however, the response contains two JSON objects back to back that are not wrapped in an array, so json.loads fails.
I made a hotfix for this: when the response contains multiple concatenated JSON objects, the JSON string is rewritten so the objects are wrapped in an array, and their actions are then merged into the first object.
Line 1138
    json_str = response.text[prefix_length:].strip()  # Strip 'for (;;);'
    if "}{" in json_str:
        # multiple json objs can come without being wrapped in an array
        json_str = f"[{json_str.replace('}{', '},{')}]"
    data = json.loads(json_str)
    if isinstance(data, list):
        for i, subdata in enumerate(data):
            if i == 0:
                continue
            data[0]['payload']['actions'].extend(subdata['payload']['actions'])
        data = data[0]
Line 1159
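One caveat with the string replacement above: it would also rewrite a "}{" that happens to appear inside a string value (for example in comment HTML). A possible alternative, shown here only as a sketch and not as what the library does, is to decode the objects one at a time with json.JSONDecoder.raw_decode from the standard library. The helper names parse_concatenated_json and merge_actions and the sample payload are hypothetical, made up for illustration; the payload/actions layout is taken from the hotfix above.

```python
import json


def parse_concatenated_json(text):
    """Decode every top-level JSON value found in `text`, in order."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the decoded value and the index where it ended
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


def merge_actions(payloads):
    """Fold the actions of later objects into the first one (as in the hotfix)."""
    merged = payloads[0]
    for extra in payloads[1:]:
        merged["payload"]["actions"].extend(extra["payload"]["actions"])
    return merged


# Hypothetical sample: two objects back to back, one string containing "}{"
sample = (
    '{"payload": {"actions": [{"html": "a}{b"}]}}'
    '{"payload": {"actions": [{"html": "c"}]}}'
)
data = merge_actions(parse_concatenated_json(sample))
```

In extract_comment_replies this would take the place of the "}{" check, with the stripped response text passed in instead of the sample string. A single well-formed object goes through the same path unchanged.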
i.e.
@NguyenDrasp It would be helpful if you renamed this issue to: JSONDecodeError "Extra data" in extract_comment_replies
    in extract_comment_replies
        data = json.loads(response.text[prefix_length:])  # Strip 'for (;;);'
      File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python3.10/json/decoder.py", line 340, in decode
        raise JSONDecodeError("Extra data", s, end)
    json.decoder.JSONDecodeError: Extra data: line 1 column 30442 (char 30441)