jointakahe / takahe

An ActivityPub/Fediverse server
BSD 3-Clause "New" or "Revised" License
1.1k stars 84 forks source link

deal with unknown json-ld schema #644

Closed osmaa closed 10 months ago

osmaa commented 10 months ago

ref #640 this implements a remote schema loader with a tiny in-memory cache for not-built-in contexts, and also returns an empty schema when one can't be loaded (or is not valid json) from the remote server. This entirely eliminated the largest source of errors in my sentry monitoring even without #643.

Introduces a new dependency with httpx-cache. A simpler version of this fix would be to just replace the JsonLdError in builtin_document_loader with a return {}, since nothing depends on not canonicalizing schema which hasn't been built-in.

andrewgodwin commented 10 months ago

Yeah, to be honest I'd rather go with the return {}, since ActivityStreams specifically says that messages should adhere to an exact main context and that you don't need a full JSON-LD parser to decode them; the less network requests we can do for things we're not going to use, the better.

osmaa commented 10 months ago

yeah, I haven't noticed any benefit to pulling these remote schemas - quite the contrary, many AP messages apparently have context references to 404 URLs.

andrewgodwin commented 10 months ago

Yes, I suspect that some of them are just used as flags for internal parsers or similar. Want to flip this PR to just return empty context, or open a new one for that?

osmaa commented 10 months ago

context loader and cache reverted, blank schema added to the builtins, returning that instead of raising exception

andrewgodwin commented 10 months ago

Lovely, thanks for that!