halaxa / json-machine

Efficient, easy-to-use, and fast PHP JSON stream parser
Apache License 2.0

UnexpectedEndSyntaxErrorException #91

Closed TechOverflow closed 1 year ago

TechOverflow commented 1 year ago

I sometimes get this error when reading valid JSON files. Any idea why this could be happening? If I load the file into any JSON validator, it reports no error at ','. Sometimes the reported position is not ',' but some other arbitrary part of the JSON file.

PHP Fatal error: Uncaught JsonMachine\Exception\UnexpectedEndSyntaxErrorException: JSON string ended unexpectedly ',' At position 0. in /.../.../.../.../.../vendor/halaxa/json-machine/src/Parser.php:368
Stack trace:
#0 /.../.../.../.../.../vendor/halaxa/json-machine/src/Parser.php(249): JsonMachine\Parser->error()
#1 /.../.../.../.../.../.../load.php(92): JsonMachine\Parser->getIterator()

TechOverflow commented 1 year ago

Similar fault below. Interestingly, when I re-run the script, everything seems fine.

PHP Fatal error:  Uncaught JsonMachine\Exception\SyntaxErrorException: Unexpected symbol '"PROBLEM"' At position 0. in /.../.../.../.../.../vendor/halaxa/json-machine/src/Parser.php:368
Stack trace:
#0 /.../.../.../.../.../vendor/halaxa/json-machine/src/Parser.php(118): JsonMachine\Parser->error()
#1 /.../.../.../.../.../.../load.php(92): JsonMachine\Parser->getIterator()

Json example:

[
   {
      "PROBLEM": "",
      "DATA": "",
      ...
   }
]

The json file is small and contains 143 items structured like this json example.

halaxa commented 1 year ago

Hi and thanks 👍 Any piece of code that could replicate that would be greatly appreciated. I can turn this issue into a PR if necessary.

BeGood20 commented 1 year ago

Is there a solution to the issue above? With a large 1GB JSON file read from a URL, I seem to be getting the error quite a lot:

PHP Fatal error: Uncaught JsonMachine\Exception\UnexpectedEndSyntaxErrorException: JSON string ended unexpectedly '"AppealD' At position 0. in .../vendor/halaxa/json-machine/src/Parser.php:349
Stack trace:
#0 .../vendor/halaxa/json-machine/src/Parser.php(254): JsonMachine\Parser->error('JSON string end...', '"AppealD', 'JsonMachine\Exc...')
#1 mycronfile.php(58): JsonMachine\Parser->getIterator()
#2 {main}
  thrown in .../vendor/halaxa/json-machine/src/Parser.php on line 349

halaxa commented 1 year ago

Hi, as I wrote above, a piece of failing code would help to debug this. Could you provide it please?

BeGood20 commented 1 year ago

Hi halaxa,

Is there a way I can fix up the parsing for this? I think there may be a few issues within the array content itself that need handling. I am not sure whether the &, ', or / symbols are allowed, or even whitespace. A sample follows, with 3 rows from the GeoJSON file included:

{
"type": "FeatureCollection",
"name": "Applications",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "OBJECTID": 105997, "Authority": "County Council", "ApplicationNumber": "FW19A/0158          ", "DevelopmentDescription": "Proposed 10 x 5M x 3.6M high canopy with supporting steel structure & ", "DevelopmentAddress": "Unit 2, Sorthwest Dusiness Park, Dallycoolin, Doodle 15", "DevelopmentPostcode": null, "ITMEasting": null, "ITMNorthing": null, "ApplicationStatus": "Decision made                                     ", "ApplicationType": "Permission                                        ", "ApplicantForename": "", "ApplicantSurname": "", "ApplicantAddress": "", "Decision": "GRANT PERMISSION                                  ", "LandUseCode": null, "AreaofSite": 15520.0, "NumResidentialUnits": null, "OneOffHouse": null, "FloorArea": null, "ReceivedDate": "2019-09-13T00:00:00Z", "WithdrawnDate": null, "DecisionDate": "2019-11-07T00:00:00Z", "DecisionDueDate": "2019-11-07T00:00:00Z", "GrantDate": "2019-12-19T00:00:00Z", "ExpiryDate": null, "AppealRefNumber": null, "AppealStatus": null, "AppealDecision": null, "AppealDecisionDate": null, "AppealSubmittedDate": null, "FIRequestDate": null, "FIRecDate": null, "LinkAppDetails": "https://planning.agileapplications.ie/fingal/appli", "OneOffKPI": null, "ETL_DATE": "2023-01-02T00:00:00Z", "SiteId": null, "ORIG_FID": 349260 }, "geometry": { "type": "Point", "coordinates": [ -6.546507088999942, 63.411123370000041 ] } },
{ "type": "Feature", "properties": { "OBJECTID": 105998, "Authority": "County Council", "ApplicationNumber": "F13B/0135           ", "DevelopmentDescription": "Construction of a new ground floor, single storey extension (circa 23.", "DevelopmentAddress": "2 Bshdale Close, Dinsealy, Doodle 1", "DevelopmentPostcode": null, "ITMEasting": null, "ITMNorthing": null, "ApplicationStatus": "Decision made                                     ", "ApplicationType": "Permission                                        ", "ApplicantForename": "", "ApplicantSurname": "", "ApplicantAddress": "", "Decision": "GRANT PERMISSION                                  ", "LandUseCode": null, "AreaofSite": 300.0, "NumResidentialUnits": null, "OneOffHouse": null, "FloorArea": null, "ReceivedDate": "2013-09-06T00:00:00Z", "WithdrawnDate": null, "DecisionDate": "2013-10-24T00:00:00Z", "DecisionDueDate": "2013-10-31T00:00:00Z", "GrantDate": "2013-12-02T00:00:00Z", "ExpiryDate": null, "AppealRefNumber": null, "AppealStatus": null, "AppealDecision": null, "AppealDecisionDate": null, "AppealSubmittedDate": null, "FIRequestDate": null, "FIRecDate": null, "LinkAppDetails": "https://planning.agileapplications.ie/fingal/appli", "OneOffKPI": null, "ETL_DATE": "2023-01-02T00:00:00Z", "SiteId": null, "ORIG_FID": 349261 }, "geometry": { "type": "Point", "coordinates": [ -6.896988891999979, 73.445792320000066 ] } },
{ "type": "Feature", "properties": { "OBJECTID": 111493, "Authority": "County Council", "ApplicationNumber": "FW15A/0114          ", "DevelopmentDescription": "A mixed use residential and commercial development comprising a total ", "DevelopmentAddress": "'Duinch' & adjoining lands, Dollystown, Dollywoodrath, Doodle 15", "DevelopmentPostcode": null, "ITMEasting": null, "ITMNorthing": null, "ApplicationStatus": "Appeal decided                                    ", "ApplicationType": "Permission                                        ", "ApplicantForename": "", "ApplicantSurname": "", "ApplicantAddress": "", "Decision": "REFUSE PERMISSION                                 ", "LandUseCode": null, "AreaofSite": 32392.0, "NumResidentialUnits": null, "OneOffHouse": null, "FloorArea": null, "ReceivedDate": "2016-02-05T00:00:00Z", "WithdrawnDate": null, "DecisionDate": "2016-03-03T00:00:00Z", "DecisionDueDate": "2016-03-03T00:00:00Z", "GrantDate": null, "ExpiryDate": null, "AppealRefNumber": "PL06F. 246379       ", "AppealStatus": "Granted                                           ", "AppealDecision": "Refuse Permission                                 ", "AppealDecisionDate": "2016-08-02T00:00:00Z", "AppealSubmittedDate": "2016-03-30T00:00:00Z", "FIRequestDate": "2016-01-28T00:00:00Z", "FIRecDate": "2016-02-05T00:00:00Z", "LinkAppDetails": "https://planning.agileapplications.ie/fingal/appli", "OneOffKPI": null, "ETL_DATE": "2023-01-02T00:00:00Z", "SiteId": null, "ORIG_FID": 354756 }, "geometry": { "type": "Point", "coordinates": [ -6.675444867999931, 73.427724581000064 ] } }
]
}

The geoJSON is being read via an opendata url. Latest error received is:

PHP Fatal error: Uncaught JsonMachine\Exception\UnexpectedEndSyntaxErrorException: JSON string ended unexpectedly '"A mixed use residential and commercial development comprising a total' At position 0. in /vendor/halaxa/json-machine/src/Parser.php:349
Stack trace:
#0 /vendor/halaxa/json-machine/src/Parser.php(254): JsonMachine\Parser->error('JSON string end...', '"A mixed use re...', 'JsonMachine\Exc...')
#1 /myscript.php(58): JsonMachine\Parser->getIterator()
#2 {main}
  thrown in /vendor/halaxa/json-machine/src/Parser.php on line 349

If needed, I can provide the json url privately. Please let me know how I can do that.

halaxa commented 1 year ago

Thanks. Please also add the PHP code snippet you're iterating it with. I guess the JSON pointer is /features, but that gives me no error.

BeGood20 commented 1 year ago

Hi,

There are 420k+ rows required to read from a 1GB url file which will be added to the db. The reason I have a starting point (StartingRow) below is so the script can start up again at the last point before it got cut off. The error keeps getting thrown for OBJECTID (105997) and won't add anything after that.

use JsonMachine\Items;

$filename = 'https://url';
// Lazily iterate the items found under the /features JSON pointer
$file_loaded = Items::fromFile($filename, ['pointer' => '/features']);
// db code to get db results for $result3
foreach ($result3 as $DBresult) {
    $AmountOfRow = $DBresult['AmountOfRows'];
    if ($DBresult['StartingRow'] <= $AmountOfRow) {
        $rowsPerRun = 2000;
        if ($DBresult['StartingRow'] != 0) {
            $StartingRow = $DBresult['StartingRow'] + 1;
        } else {
            $StartingRow = $DBresult['StartingRow'];
        }
        for ($i = $StartingRow; $i <= $AmountOfRow; $i = $i+$rowsPerRun+1) {
            foreach ($file_loaded as $features => $data) {
                if (isset($data->properties->OBJECTID) && $data->properties->OBJECTID >= $StartingRow && $data->properties->OBJECTID < $AmountOfRow+1) {
                    if (isset($data->properties->OBJECTID)) {
                        $Authority = trim($data->properties->Authority);
                        // rest of code
                    }
                }
            }
        }
    }
}
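
Editorial note on the snippet above: the inner foreach appears to restart iteration of $file_loaded on every step of the outer for loop, so the whole file would be read many times over. A minimal sketch of a single-pass alternative follows. It assumes the same OBJECTID range filter; the helper name featuresInRange and the stand-in $sample array are hypothetical and not part of JSON Machine (in the thread, the input would be the Items::fromFile(...) result).

```php
<?php
// Hypothetical helper (not part of JSON Machine): yields only the features
// whose OBJECTID lies in [$startId, $endId], reading $features exactly once.
function featuresInRange(iterable $features, int $startId, int $endId): Generator
{
    foreach ($features as $feature) {
        // ?? avoids notices when properties or OBJECTID is missing
        $id = $feature->properties->OBJECTID ?? null;
        if ($id !== null && $id >= $startId && $id <= $endId) {
            yield $feature;
        }
    }
}

// Stand-in data shaped like the GeoJSON features in the thread.
$sample = [
    (object) ['properties' => (object) ['OBJECTID' => 105997, 'Authority' => 'County Council']],
    (object) ['properties' => (object) ['OBJECTID' => 105998, 'Authority' => 'County Council']],
    (object) ['properties' => (object) ['OBJECTID' => 111493, 'Authority' => 'County Council']],
    (object) ['type' => 'Feature'], // no properties: skipped
];

// Resume after the last stored row (105997) and stop at the last expected ID.
foreach (featuresInRange($sample, 105998, 111493) as $feature) {
    echo $feature->properties->OBJECTID, PHP_EOL;
}
```

Because the helper accepts any iterable, the same loop should work unchanged when $sample is swapped for the streamed items.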

halaxa commented 1 year ago

Thanks. This setup gives me no error. Just 2 follow-ups: Do you get the error every time on this chunk of JSON? And do you use the latest JM release?

BeGood20 commented 1 year ago

I get the error every time on the full json file. Can I private message you the url to this file?

halaxa commented 1 year ago

Before we continue, please confirm you have version 1.1.3 installed.

BeGood20 commented 1 year ago

Yes, the CHANGELOG.md shows:

## 1.1.3 - 2022-10-12

BeGood20 commented 1 year ago

Any progress with this?

halaxa commented 1 year ago

I'm on holiday. I'll get to it when I get back.

halaxa commented 1 year ago

Please send me the url to filip@halaxa.cz. Thanks.

halaxa commented 1 year ago

@BeGood20 provided the file privately.

Iterating the whole thing throws no error on my side. 2 things come to mind.

halaxa commented 1 year ago

Any news on this?

BeGood20 commented 1 year ago

@halaxa it was a memory issue. It is still very slow because of the file size, though: it adds one record per second. My next idea is to convert the file to CSV to make it faster.

halaxa commented 1 year ago

It depends on how big a single item is, but one per second seems untenably slow anyway. I can process about 15 MB/s of JSON on my average Windows desktop inside WSL2. If you import the items into a database, try optimizing your queries (see https://dev.mysql.com/doc/refman/8.0/en/insert-optimization.html, or wrap multiple inserts in a transaction). Also, make sure Xdebug is disabled in production.
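
The transaction suggestion can be sketched roughly as follows. This is an illustration under assumptions, not the code used in this thread: the helper importInBatches, the applications table, and its columns are hypothetical, and an in-memory SQLite database (requires the pdo_sqlite extension) stands in for the real MySQL connection. One prepared statement is reused and a commit happens once per batch instead of once per row.

```php
<?php
// Hypothetical helper: inserts rows from any iterable, committing per batch.
function importInBatches(PDO $pdo, iterable $rows, int $batchSize = 1000): int
{
    // Table and column names are assumptions for illustration only.
    $stmt = $pdo->prepare('INSERT INTO applications (object_id, authority) VALUES (?, ?)');
    $count = 0;
    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $stmt->execute([$row['object_id'], $row['authority']]);
            if (++$count % $batchSize === 0) {
                // Flush the finished batch and open the next one
                $pdo->commit();
                $pdo->beginTransaction();
            }
        }
        $pdo->commit(); // commit the final, possibly partial, batch
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
    return $count;
}

// Demo against in-memory SQLite so the sketch runs without a server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE applications (object_id INTEGER PRIMARY KEY, authority TEXT)');

$rows = [
    ['object_id' => 105997, 'authority' => 'County Council'],
    ['object_id' => 105998, 'authority' => 'County Council'],
    ['object_id' => 111493, 'authority' => 'County Council'],
];
$inserted = importInBatches($pdo, $rows, 2);
```

Since $rows is any iterable, a generator that maps streamed JSON items to rows could be passed in directly, which keeps memory use flat while the commits are batched.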

halaxa commented 1 year ago

Closing as resolved for now.