XMLTV / xmltv

Utilities to obtain, generate, and post-process TV listings data in XMLTV format
GNU General Public License v2.0
300 stars 94 forks source link

malformed JSON string at /usr/bin/tv_grab_zz_sdjson line 278 #169

Closed joeczucha closed 2 years ago

joeczucha commented 2 years ago

Thanks for taking the time to report an issue. Please take a moment to review our open/closed issues above, in case your issue has already been reported.

If you are reporting a new issue, please give your issue a descriptive title and fill out the blanks below, providing as much information as possible.

XMLTV Version?

XMLTV module version 1.0.0

XMLTV Component?

This is tv_grab_zz_sdjson version 1.0.0

What happened?

Firstly, apologies if this is in the wrong place. I've been using TVHeadEnd which makes use of tv_grab_zz_sdjson and I'm still getting my head around how it all pieces together.

It's been working perfectly until now, for the past year or so. However, from a couple of months ago, I started getting an error message (see below).

What's strange is that the error seems to happen at different times, which suggests some kind of timeout rather than a straight failure.

What did you expect to happen?

Schedules to download, as before

Did you see any warnings/errors?

** POST https://json.schedulesdirect.org/20141201/schedules/md5 ==> 200 OK (8s)
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 278.

I also get this error, but I'm not sure if it's relevant as it happens earlier:

Magic number checking on storable file failed at /usr/lib/perl5/core_perl/Storable.pm line 421, at /usr/bin/tv_grab_zz_sdjson line 693.

What steps are needed to reproduce this issue?

1) Log into Docker container: docker exec -it tvheadend /bin/bash 2) Run grabber: tv_grab_zz_sdjson --days 0 > list.xml

Any other information?

i tried using tv_grab_zz_sdjson_sqlite --days 0 > list.xml (after configuring it) and it seems to complete, without issue.

For posterity, here is the entire log of the output:

$ tv_grab_zz_sdjson --days 0  > list.xml
Initializing...
Magic number checking on storable file failed at /usr/lib/perl5/core_perl/Storable.pm line 421, at /usr/bin/tv_grab_zz_sdjson line 693.
** POST https://json.schedulesdirect.org/20141201/token ==> 200 OK (1s)
** GET https://json.schedulesdirect.org/20141201/status ==> 200 OK (1s)
Updating lineups...
** GET https://json.schedulesdirect.org/20141201/lineups/GBR-1000193-DEFAULT ==> 200 OK (1s)
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 278.
Retry in 10 seconds...
** GET https://json.schedulesdirect.org/20141201/lineups/GBR-1000193-DEFAULT ==> 200 OK (1s)
** GET https://json.schedulesdirect.org/20141201/lineups/IRL-1000448-DEFAULT ==> 200 OK (1s)
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 278.
Retry in 10 seconds...
** GET https://json.schedulesdirect.org/20141201/lineups/IRL-1000448-DEFAULT ==> 200 OK
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 278.
Retry in 10 seconds...
** GET https://json.schedulesdirect.org/20141201/lineups/IRL-1000448-DEFAULT ==> 200 OK (1s)
** GET https://json.schedulesdirect.org/20141201/lineups/USA-SLINGTV-DEFAULT ==> 200 OK (1s)
Indexing channels...
Updating schedules...
** POST https://json.schedulesdirect.org/20141201/schedules/md5 ==> 200 OK (30s)
** POST https://json.schedulesdirect.org/20141201/schedules ==> 200 OK (15s)
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 278.
$

What other software are you using?

OS: Docker container running on a Synology NAS. Container kernel: 4.4.180+

Perl: v5.34.0

I hope that's enough to go on, any advice greatly apprecaited.

garybuhrmaster commented 2 years ago

First, you are in at least an appropriate place for reporting the issue (some people prefer other venues, but this one works, too).

Secondly, from your log, it appears that the grabber did retry (at it should) when it got the bad response and then later got a valid (or at least not an error message) response. At least initially until the point of getting schedules (where it did not retry?)

A 1s response time normally would not indicate a timeout (when others have seen true network timeouts the numbers where in minutes)..

Unfortunately, with a simple test case (using similar lineups) I can't reproduce getting malformed results with tv_grab_zz_sdjson.

Since the source data that tv_grab_zz_sdjson and tv_grab_zz_sdjson_sqlite retrieve is (or should be) essentially the same (although the implementation, including validation and retries and amount of data retrieved at once, is substantially different) it would suggest your setup (at Schedules Direct) is correct.

If this is (semi) reproducable, you can try adding something like:

print STDERR Dumper($response->decoded_content());

right before line 278 (the line that decodes the response) and report back what you see.

If there is no valid response content at all, perhaps instead try:

print STDERR Dumper($response);

(which, unfortunately, will print out the binary data, but there is also additional info in there from the response).

joeczucha commented 2 years ago

First, you are in at least an appropriate place for reporting the issue (some people prefer other venues, but this one works, too).

Great, thanks for clarifying and for getting back to me so quickly.

A 1s response time normally would not indicate a timeout (when others have seen true network timeouts the numbers where in minutes)..

Agreed, it seems like the response is empty. More on that shortly...

If this is (semi) reproducable, you can try adding something like: print STDERR Dumper($response->decoded_content());

This works for the token

** POST https://json.schedulesdirect.org/20141201/token ==> 200 OK
$VAR1 = '{"code":0,"message":"OK","serverID":"20141201.web","datetime":"2022-03-24T18:22:02Z","token":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}';

But is empty after that

** GET https://json.schedulesdirect.org/20141201/lineups/GBR-1000193-DEFAULT ==> 200 OK (1s)
$VAR1 = undef;

If there is no valid response content at all, perhaps instead try: print STDERR Dumper($response);

As you warned, this threw out a whole load of binary. I won't include it here, but just before the failure I get the following (hopefully it's enough to go on)

;G���w{����6��)]�{y����t����xۛH����<��I�V�v/ikvikv?ik�v�ik^��UY;��7x��r}
                                                                    ���눅!�D�U]~I�L',
                '_rc' => 200,
                '_msg' => 'OK',
                '_protocol' => 'HTTP/1.1',
                '_headers' => bless( {
                                        '::std_case' => {
                                                        'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject',
                                                        'schedulesdirect-serverid' => 'Schedulesdirect-Serverid',
                                                        'client-response-num' => 'Client-Response-Num',
                                                        'client-date' => 'Client-Date',
                                                        'x-died' => 'X-Died',
                                                        'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer',
                                                        'x-frame-options' => 'X-Frame-Options',
                                                        'client-transfer-encoding' => 'Client-Transfer-Encoding',
                                                        'client-ssl-cipher' => 'Client-SSL-Cipher',
                                                        'x-content-type-options' => 'X-Content-Type-Options',
                                                        'client-ssl-version' => 'Client-SSL-Version',
                                                        'x-xss-protection' => 'X-XSS-Protection',
                                                        'client-peer' => 'Client-Peer',
                                                        'client-ssl-socket-class' => 'Client-SSL-Socket-Class',
                                                        'client-aborted' => 'Client-Aborted'
                                                        },
                                        'date' => 'Thu, 24 Mar 2022 18:27:38 GMT',
                                        'connection' => 'close',
                                        'client-aborted' => 'die',
                                        'content-type' => 'application/json;charset=UTF-8',
                                        'client-transfer-encoding' => [
                                                                        'chunked'
                                                                    ],
                                        'client-ssl-cipher' => 'ECDHE-RSA-AES128-GCM-SHA256',
                                        'x-content-type-options' => 'nosniff',
                                        'schedulesdirect-serverid' => '20141201.web',
                                        'client-date' => 'Thu, 24 Mar 2022 18:27:38 GMT',
                                        'server' => 'nginx/1.18.0 (Ubuntu)',
                                        'client-ssl-cert-issuer' => '/C=US/O=Amazon/OU=Server CA 1B/CN=Amazon',
                                        'content-encoding' => 'gzip',
                                        'vary' => 'Accept-Encoding',
                                        'client-peer' => '34.224.228.247:443',
                                        'client-ssl-socket-class' => 'IO::Socket::SSL',
                                        'x-xss-protection' => '1; mode=block',
                                        'client-ssl-version' => 'TLSv1_2',
                                        'etag' => 'W/"XHi8KWlSwAySHrkCkgjPfA"',
                                        'x-frame-options' => 'SAMEORIGIN',
                                        'client-response-num' => 1,
                                        'x-died' => 'read failed: Invalid argument at /usr/share/perl5/vendor_perl/LWP/Protocol/http.pm line 471.',
                                        'client-ssl-cert-subject' => '/CN=*.json.schedulesdirect.org'
                                    }, 'HTTP::Headers' ),
                '_request' => bless( {
                                        '_uri_canonical' => bless( do{\(my $o = 'https://json.schedulesdirect.org/20141201/lineups/GBR-1000193-DEFAULT')}, 'URI::https' ),
                                        '_method' => 'GET',
                                        '_headers' => bless( {
                                                            'user-agent' => 'tv_grab_zz_sdjson 1.0.0',
                                                            'accept-encoding' => 'gzip, x-gzip, deflate, x-bzip2, bzip2',
                                                            'content-type' => 'application/json',
                                                            '::std_case' => {
                                                                                'if-ssl-cert-subject' => 'If-SSL-Cert-Subject',
                                                                                'token' => 'Token'
                                                                            },
                                                            'token' => 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
                                                            }, 'HTTP::Headers' ),
                                        '_uri' => $VAR1->{'_request'}{'_uri_canonical'},
                                        '_content' => ''
                                    }, 'HTTP::Request' )
            }, 'HTTP::Response' );
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/bin/tv_grab_zz_sdjson line 279.
Retry in 10 seconds...

Thanks again for your advice so far!

garybuhrmaster commented 2 years ago

Thanks again for your advice so far!

Pretty sure the important response line is:

'x-died' => 'read failed: Invalid argument at /usr/share/perl5/vendor_perl/LWP/Protocol/[http.pm](http://http.pm/) line 471.',

Unfortunately, "Invalid Argument" is not exactly sufficient ($! in perl contains "Invalid argument", but the more complete error would (might) be in $@).

You can try to add in a print .... dumper of $@, although I would not be surprised if the entire call stack pop may have replaced that value by the time of the return to the grabber itself, so it might not contain anything useful.

You can also try modifying your local copy of LWP/Protocol/http.pm to add in $@ (after the $!) to the die (at line 471) to see if the additional info is useful.

And, it all else fails (and mostly as a "random" try anything test), you could try to rebuild your docker container to see if it can pull in any new(er) updates that might change things.

btw, you said it started about a month ago. Did you update something about then?

joeczucha commented 2 years ago

I've just stumbled across the above issue on the docker image repo so I'll close this now, as it seems to be related to a rebase that they did in January.

Thanks again for your help.

garybuhrmaster commented 2 years ago

Thanks again for your help.

And thanks for the followup and closing of this issue.