Closed greggh closed 5 years ago
did you find a solution?
Hi lanthaler/JsonLD folks.
What I think you are/were seeing here is a defect that appeared when Schema.org v3.5 was released. Under certain circumstances the JSON-LD context document did not include a CORS header in its response (see the public-json-ld-wg thread for details).
This problem was addressed in a fix I applied to the site April 5th.
If this is still a problem, can you raise an issue in the schemaorg/schemaorg repo?
I have the same extension and the same code on two different websites; one works and the other does not.
Monitoring the server logs for the current version I can see many calls from "lanthaler JsonLD" clients successfully requesting the context file without errors.
@gomonkey Do you have debug traces of the failing calls (preferably with HTTP request/response values) that can help identify what your particular issue might be?
As a matter of interest, from the logs I can see client requests to schema.org from a single IP address requesting the context file again and again (often within milliseconds of its last request; one 5-second snapshot revealed 72 requests from a single AWS-hosted IP). Has any thought been given to caching such requests, potentially using HTTP response headers such as Last-Modified: and Cache-Control: max-age?
This I believe would be good for the performance of clients using this processor and general traffic reduction (especially for cloud based systems paying for bytes transferred).
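A minimal sketch of the caching idea raised above, assuming the client can wrap whatever function performs the HTTP fetch. `CachingFetcher` and `parseMaxAge` are hypothetical names and are not part of lanthaler/JsonLD; the cache is in-memory only and honours nothing but `Cache-Control: max-age`.

```php
<?php
// Sketch only: CachingFetcher and parseMaxAge are hypothetical names,
// not part of lanthaler/JsonLD.

/**
 * Return max-age (seconds) from the last Cache-Control header found, or
 * $default when none is present. After redirects PHP lists the headers of
 * every hop in order, so the last match belongs to the final response.
 */
function parseMaxAge(array $headers, int $default = 0): int
{
    $maxAge = $default;
    foreach ($headers as $header) {
        if (preg_match('/^Cache-Control:.*\bmax-age=(\d+)/i', $header, $m)) {
            $maxAge = (int) $m[1];
        }
    }
    return $maxAge;
}

final class CachingFetcher
{
    /** @var callable function (string $url): array [body, headerLines] */
    private $fetch;

    /** @var array url => ['body' => string, 'expires' => int] */
    private $cache = [];

    public function __construct(callable $fetch)
    {
        $this->fetch = $fetch;
    }

    /** Return the cached body while it is fresh; otherwise fetch again. */
    public function get(string $url, ?int $now = null): string
    {
        $now = $now ?? time();
        if (isset($this->cache[$url]) && $this->cache[$url]['expires'] > $now) {
            return $this->cache[$url]['body'];
        }
        list($body, $headers) = ($this->fetch)($url);
        $this->cache[$url] = [
            'body' => $body,
            'expires' => $now + parseMaxAge($headers),
        ];
        return $body;
    }
}
```

With the `max-age=600` that schema.org sends in the trace later in this thread, repeated lookups within ten minutes would be served from memory instead of hitting the server again. A persistent cache (file, APCu, etc.) would be needed to share the result across PHP requests.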
The debug traces are the same as @greggh's.
private function loadDocument($input)
{
    if (false === is_string($input)) {
        // Return as is - it has already been parsed
        return $input;
    }
    $document = $this->documentLoader->loadDocument($input);
    return $document->document;
}
$input gets past the is_string check, but $document may be empty, so execution never reaches return $document->document;
If I put die() before the return, it still prints the error.
Something is happening here:
$document = $this->documentLoader->loadDocument($input);
@gomonkey I concur with your conclusion that the problem is somewhere in the code called from $this->documentLoader->loadDocument($input).
I have no experience with the code in the JsonLD Processor, and my PHP is very rusty.
Unfortunately without details of the http request, including headers, and the http status & response returned, that results from that call, it will be exceedingly difficult to identify the cause.
I am interested that you say you have one instance operating correctly and one that fails. What is the difference between them: network, hosting, caching architecture, firewalls, etc.?
Everything is the same (code, server, etc.); only the customer and products change. The code that works was simply pasted from the site where it fails today. But I am going to check; could it be some empty variable?
Without an understanding of the code I am not able to predict.
All we should get at schema.org is an HTTP request which contains this header: Accept: application/ld+json
Perhaps someone with an understanding of the low-level php code could help with your diagnosis.
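For anyone wanting to capture the request/response details asked for above, here is a minimal sketch using PHP's built-in HTTP stream wrapper. Both function names are hypothetical; this is not how lanthaler/JsonLD is confirmed to issue the request, only a way to reproduce a request with the `Accept: application/ld+json` header and see every header that comes back.

```php
<?php
// Sketch only: jsonLdContextOptions and fetchWithTrace are hypothetical names.

/** Stream-context options for a GET request that asks for JSON-LD. */
function jsonLdContextOptions(): array
{
    return [
        'http' => [
            'method' => 'GET',
            'header' => "Accept: application/ld+json\r\n",
            'follow_location' => 1,   // follow the 301/302 hops
            'max_redirects' => 5,
            'ignore_errors' => true,  // still return the body on 4xx/5xx
            'timeout' => 10,
        ],
    ];
}

/** Fetch $url and return the body size plus every raw response header line. */
function fetchWithTrace(string $url): array
{
    $context = stream_context_create(jsonLdContextOptions());
    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the local scope,
    // including the headers of each redirect hop.
    $headers = isset($http_response_header) ? $http_response_header : [];

    return [
        'ok' => $body !== false,
        'bytes' => $body === false ? 0 : strlen($body),
        'headers' => $headers,
    ];
}
```

Calling `print_r(fetchWithTrace('http://schema.org'));` on the failing host and on the working one should make any difference in status codes or Content-Type visible.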
Hey @RichardWallis, thanks for showing up here! I just retried the code, and it's still doing the same thing. I admit it is good timing with the change/fix on the 5th, but it looks like that didn't solve it.
Did you recently change the http / https functionality? http://schema.org now is a hard 301. Was it always? That seems to be where the code is dying, right after that 301.
@RichardWallis it looks like I am getting 2 responses from the server, and both are text/html, including the one that should be JSON. At the very least, that second one for the .json file should be one of: application/ld+json, application/json. That is where the code is dying.
$http_response_header;
array (
0 => 'HTTP/1.0 301 Moved Permanently',
1 => 'Location: https://schema.org/',
2 => 'X-Cloud-Trace-Context: b35637c48325fd44f8a7e63de27a549b',
3 => 'Date: Mon, 08 Apr 2019 15:05:44 GMT',
4 => 'Content-Type: text/html',
5 => 'Server: Google Frontend',
6 => 'Content-Length: 0',
7 => 'HTTP/1.0 302 Found',
8 => 'Content-Type: text/html; charset=utf-8',
9 => 'Access-Control-Allow-Origin: *',
10 => 'Location: https://schema.org/docs/jsonldcontext.json',
11 => 'Vary: Accept, Accept-Encoding',
12 => 'X-Cloud-Trace-Context: 543787090044a453c6bd1eaafdb08e8a',
13 => 'Date: Mon, 08 Apr 2019 14:57:23 GMT',
14 => 'Server: Google Frontend',
15 => 'Content-Length: 0',
16 => 'Cache-Control: public, max-age=600',
17 => 'Age: 501',
18 => 'Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"',
19 => 'HTTP/1.0 200 OK',
20 => 'Access-Control-Allow-Origin: *',
21 => 'Vary: Accept, Accept-Encoding',
22 => 'ETag: 6c732607a47aae095f1e5d2dcfd39846',
23 => 'Last-Modified: Mon, 08 Apr 2019 09:09:19 GMT',
24 => 'Content-Type: text/html; charset=utf-8',
25 => 'X-Cloud-Trace-Context: 614fc00a3b7b58f328a983ef7f384777',
26 => 'Date: Mon, 08 Apr 2019 14:57:23 GMT',
27 => 'Server: Google Frontend',
28 => 'Content-Length: 139274',
29 => 'Age: 502',
30 => 'Cache-Control: public, max-age=600',
31 => 'Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"',
Because it's coming back as text/html, this is failing at line 104 in FileGetContentsLoader.
if ('application/ld+json' === $remoteDocument->mediaType) {
    $remoteDocument->contextUrl = null;
} elseif (('application/json' !== $remoteDocument->mediaType) &&
    (0 !== substr_compare($remoteDocument->mediaType, '+json', -5))) {
    throw new JsonLdException(
        JsonLdException::LOADING_DOCUMENT_FAILED,
        'Invalid media type',
        $remoteDocument->mediaType
    );
}
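To illustrate why the header dump above trips that check, here is a standalone sketch that splits a flat `$http_response_header` list into one block per response, reads the media type of the final response, and applies an equivalent JSON test. The helper names are hypothetical and this is not the library's own parsing code; it uses `substr()` rather than `substr_compare()` for the suffix test.

```php
<?php
// Sketch only: hypothetical helpers, not code from FileGetContentsLoader.

/** Split the flat header list into one array per response (one per hop). */
function splitResponses(array $headers): array
{
    $responses = [];
    foreach ($headers as $line) {
        if (strncmp($line, 'HTTP/', 5) === 0) {
            $responses[] = [];  // a status line starts a new response
        }
        if ($responses) {
            $responses[count($responses) - 1][] = $line;
        }
    }
    return $responses;
}

/** Media type of one response, lowercased and without parameters. */
function mediaTypeOf(array $response): ?string
{
    foreach ($response as $line) {
        if (preg_match('/^Content-Type:\s*([^;\s]+)/i', $line, $m)) {
            return strtolower($m[1]);
        }
    }
    return null;
}

/** True for application/json and any type ending in +json. */
function isJsonMediaType(?string $mediaType): bool
{
    if ($mediaType === null) {
        return false;
    }
    return $mediaType === 'application/json' || substr($mediaType, -5) === '+json';
}

// Abbreviated from the trace above: three responses, all text/html.
$headers = [
    'HTTP/1.0 301 Moved Permanently',
    'Content-Type: text/html',
    'HTTP/1.0 302 Found',
    'Content-Type: text/html; charset=utf-8',
    'HTTP/1.0 200 OK',
    'Content-Type: text/html; charset=utf-8',
];
$responses = splitResponses($headers);
$finalType = mediaTypeOf(end($responses));
var_dump(isJsonMediaType($finalType));
```

The final response carries `text/html`, which is neither `application/json` nor a `+json` type, so a check of this shape rejects it.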
@greggh The hard 301 redirect from http://schema.org to https://schema.org has been in the code since the last version (3.4 - released June 2018)
... your trace has just come in I'll check it out...
I no longer have the problem, and I didn't change any code. What happened?
@greggh Thanks for the input, it helped me track down an obscure issue, now fixed:
wallisr$ curl -v -s -L --header "Accept: application/ld+json" http://schema.org 1> /dev/null
...
...
< HTTP/1.1 301 Moved Permanently
< Location: https://schema.org/
...
...
* Issue another request to this URL: 'https://schema.org/'
...
...
< HTTP/2 302
...
< location: https://schema.org/docs/jsonldcontext.jsonld
...
...
* Issue another request to this URL: 'https://schema.org/docs/jsonldcontext.jsonld'
...
> GET /docs/jsonldcontext.jsonld HTTP/2
...
< content-type: application/ld+json; charset=utf-8
...
< content-length: 139274
Let me know if it is now OK for you.
~Richard
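The redirect chain in the curl trace can also be walked hop by hop from PHP, which makes it easy to log the status and Content-Type at each step. This is a rough sketch under stated assumptions: `traceRedirects` is a hypothetical helper, the `$request` callable is supplied by the caller, and redirect targets are assumed to be absolute URLs as they are in the trace.

```php
<?php
// Sketch only: traceRedirects is a hypothetical helper.

/**
 * Follow Location headers one hop at a time.
 *
 * $request is function (string $url): array with keys
 *   'status'  => int
 *   'headers' => array of lowercase header name => value
 */
function traceRedirects(string $url, callable $request, int $maxHops = 5): array
{
    $trace = [];
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $response = $request($url);
        $trace[] = [
            'url' => $url,
            'status' => $response['status'],
            'content-type' => $response['headers']['content-type'] ?? null,
        ];
        $isRedirect = $response['status'] >= 300 && $response['status'] < 400;
        if (!$isRedirect || !isset($response['headers']['location'])) {
            return $trace;  // final response reached
        }
        $url = $response['headers']['location'];  // assumes an absolute URL
    }
    throw new RuntimeException('Too many redirects');
}

// Stubbed responses mirroring the curl trace, so no network access is needed.
$stub = [
    'http://schema.org' => ['status' => 301, 'headers' => ['location' => 'https://schema.org/']],
    'https://schema.org/' => ['status' => 302, 'headers' => ['location' => 'https://schema.org/docs/jsonldcontext.jsonld']],
    'https://schema.org/docs/jsonldcontext.jsonld' => ['status' => 200, 'headers' => ['content-type' => 'application/ld+json; charset=utf-8']],
];
$trace = traceRedirects('http://schema.org', function ($url) use ($stub) {
    return $stub[$url];
});
print_r($trace);
```

In real use the stub would be replaced by a function that performs one request with redirects disabled and the `Accept: application/ld+json` header set.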
@gomonkey Looks like my fix worked
@RichardWallis so glad I could help, and immensely happy you showed up in the thread. Thanks for all the help!
Thanks @RichardWallis. Closing this issue now. Please re-open if there are still issues.
Code that has been working fine for a few months started breaking today. Any time I feed in a site with JsonLD to get checked for structured data, I get:
Message: Loading http://schema.org failed Code: loading remote context failed
I can visit schema.org just fine with a web browser, so it's not some sort of block in place.
It is also happening to other people using another library that relies on this one: https://github.com/jkphl/micrometa/issues/34
It's this section in Processor.php (vendor/ml/json-ld/Processor.php) that's throwing the error. In the processContext function.