digitalbazaar / jsonld.js

A JSON-LD Processor and API implementation in JavaScript
https://json-ld.org/
Other
1.66k stars 195 forks source link

compatibility with browserify #41

Closed elf-pavlik closed 10 years ago

elf-pavlik commented 10 years ago

seams like at present we can't use this package with browserify :cry:

more details in https://github.com/mcollina/levelgraph-jsonld/issues/3#issuecomment-26105714

@mcollina also suggests some possible solutions on thread mentioned above :smile:

dlongley commented 10 years ago

cc: @davidlehn

elf-pavlik commented 10 years ago

@davidlehn i plan to work in next weeks with JSON-LD and RDFa in browser environment... will try to investigate this issue as well!

mcollina commented 10 years ago

:+1:

davidlehn commented 10 years ago

The quick fix of omitting jsdom and rdfa support works at a basic level. Of course if you do that then you lose RDFa support in the jsonld.js "request" extension.

Browserify is being aggressive with the jsonld.js code. It pulls in every dependency it sees. Even without jsdom and rdfa you end up with 334k minified js! I'm guessing much of that is not needed. If the environment detection support works after browserification, the code could just use the document loader XHR browser support rather than all the libraries needed for node support.

The jsonld.js "request" extension is just node only right now. I think a similar version optimized for browsers could be made. It would use XHR to avoid the "request" library and all its dependencies. It could probably use native browser DOM features rather than bothering with jsdom. And the stdin support is not really that useful in a browser? The RDFa parsing library would be used by both node and browser versions.

I think the way forward is to start by making sure environment detection is working in node, with regular browser use, and with browserify use. Then block browserify from using node libraries like http, request, jsdom, etc and just depend on native browser features. Then refactor the "request" extension to adapt to the environment and use XHR and native DOM rather than request and jsdom libraries.

That should let jsonld.js and the request extension work in all three environments and be a bit optimized too.

Thoughts?

mcollina commented 10 years ago

Browserify is being aggressive with the jsonld.js code. It pulls in every dependency it sees. Even without jsdom and rdfa you end up with 334k minified js! I'm guessing much of that is not needed. If the environment detection support works after browserification, the code could just use the document loader XHR browser support rather than all the libraries needed for node support.

That's the point of browserify, it is aggressive so every node-only module works out-of-the-box, but the fact that jsonld.js is both node and browser confuses browserify.

The jsonld.js "request" extension is just node only right now. I think a similar version optimized for browsers could be made. It would use XHR to avoid the "request" library and all its dependencies. It could probably use native browser DOM features rather than bothering with jsdom. And the stdin support is not really that useful in a browser? The RDFa parsing library would be used by both node and browser versions.

Exactly.

Another solution is to better modularize the code, and do a 'release' for the browser by creating a browserify bundle. This allows to better modularize the code into multiple files, and then to use browserify 'browser' section to provide the browser variants.

Anyway, :+1: for better node, browser and browserify support :).

elf-pavlik commented 10 years ago

thank you @davidlehn for following up on this issue!

IMO one should have possibility to use jsonld.js both in browser and in node without using anything related to RDFa, Turtle, RDF/XML or Microdata. Especially when used as dependency in their projects which may already handle other serializations using some other tools.

I'll try to take a closer look at how jsonld.js currently work sometimes this/next week. Do you have by any chance some kind of conceptual diagram, like for example one at: http://viejs.org ? I hope to come up with some suggestions on more modular setup so everyone can use only what they really need...

I didn't look yet on how you extend request, but I've seen people using various alternatives like:

So I would like to take a look at possibility of staying agnostic about how one handles requests :)

mcollina commented 10 years ago

I'm :+1: for using superagent as it works wonderfully both on node and on the browser.

A modular setup might be very welcome. Doing separate projects will also help adoption.

davidlehn commented 10 years ago

@mcollina Yeah, I understand why browserify is pulling in everything, but it seems rather heavyweight depending on what you need. I'm pretty sure the environment detection part is needed no matter what. But if you were just using the basic JSON-LD processing, there is little need for the node and "request" extension support. If you want to do things like process RDFa, then at least part of the "request" extension is needed including something to use browser DOM vs jsdom. If you are using more specialized things like passing "request" lib params to the "request" extension, then you likely need to pull in the kitchen sink, or much of it. I have no idea how to make all these options easy to use. The solution will probably involve splitting up functionality into files that can be included, excluded, or substituted as needed.

@elf-pavlik The support to handle other formats is all in the "request" extension right now. It really just handles RDFa and JSON but the theory is it could be expanded with more type negotiation and additional parsers. We don't have diagrams yet, sorry. Until now it hasn't been complex enough to need diagrams.

I know there are lots of other HTTP request libraries. At the time, one reason I choose to use "request" because it was the easiest to add support for the draft HTTP Signature spec via the related node-http-signature lib. Adding an abstraction layer was overkill. I did try to make everything work such that request lib params are passed via a specific named options object but I think the return value may be request lib specific. Making a generic one-request-API-to-rule-them-all layer one be better done in another project. I'd rather not tackle that sort of problem myself but would be happy to use it. :-)

elf-pavlik commented 10 years ago

@davidlehn i'll do my best to read through all the code this weekend or early next week and then start working on a pull request!

would you see possibility of shifting jsonld.js focus exclusively on JSON-LD and create dedicated project for more of a high level tool which mixes in RDFa, HTTP Signature etc? i could also check what @webr3 has in mind with https://github.com/webr3/rdf.js

i assume that whatever we do we can't break https://github.com/digitalbazaar/payswarm.js ;) BTW wouldn't HTTP Signatures fit there better? (blind shooting ATM, please feel free to ignore!)

luckily it looks like this package still doesn't have that many dependents: https://npmjs.org/package/jsonld

davidlehn commented 10 years ago

If it makes sense to split out things like the "request" extension into a new package, that's fine. It was just easier to add to the main package at the time. That won't fix all the problems, and just moves some of them to the new package, but might help for some use cases.

I haven't thought about the HTTP Signature issue too much. It needs the method, path, and get/set header ability and maybe to deal with some gotchas with things like the Date header. The current system using the "request" lib is pretty nice since you just pass in your signing info for any request and the lib figures it out for you.

There are more users of the package than just the npm dependents list. It would certainly be nice not to break payswarm.js too much but that code can be updated as needed.

elf-pavlik commented 10 years ago

without diving into code i guess jsonld.js needs to make requests to fetch external contexts? in a browser it might get tricky if someone doesn't http://enable-cors.org , maybe yet another reason to leave it to dependent package to provide mechanism for making requests... one could then deal with CORS issues by using JSONP, falling back to proxy or whatever fits given case!

pulling out RDFa sounds even more attractive to me and possibly more straight forward...

i will ask folks participating in http://www.w3.org/community/rdfjs for their feedback and maybe I better read the code before discussing it further :book:

thank you @davidlehn for all the clarifications, same @mcollina for all the creative suggestions :+1:

davidlehn commented 10 years ago

Yes, the main file has XHR and node support for loading contexts. In theory, a browserified file could just use the XHR support too rather than the node support. Which is why some experimentation is needed with the environment detection. The "request" extension is headed in the direction of just giving it a URL and it figures out type negotiation, parsing, and extraction details to give you back the JSON-LD for that resource. More work is needed but that was the theory.

dlongley commented 10 years ago

Issues #29 and #31 should be looked over when considering any updates to how requests work. After the "request" extension was written, better support was added for external document loading to the main jsonld.js file. This support has not yet been integrated into (or used to replace parts of) the "request" extension. The jsonld.js external document loader API should probably be used in any rewrite/fixes.

elf-pavlik commented 10 years ago

As first step I would like to pull out all the RDFa related code somewhere else with only necessary changes to the way of handling requests. What do you think about creating another repository which would use this one as dependency and recreating there all the functionality provided currently here by this JSON-LD + RDFa love affair? :wink:

elf-pavlik commented 10 years ago

I see you include xmldom to handle rdf:XMLLiteral (http://www.w3.org/TR/2014/PR-rdf11-concepts-20140109/#section-XMLLiteral) Searching json-ld and json-ld-api specs for XMLLiteral gives no results, same with public-linked-json archives Do I need to keep anything specific in mind when it comes to rdf:XMLLiteral in JSON-LD ?

dlongley commented 10 years ago

@elf-pavlik, No, not with respect to JSON-LD. That lib is only included as part of the RDFa => JSON-LD conversion code, IIRC.

davidlehn commented 10 years ago

A continuation of discussion on the rdfjs list: http://lists.w3.org/Archives/Public/public-rdfjs/2014Jan/0014.html

I'd rather not see jsonld.js split up at all. It's another thing to manage, and I'm not convinced we can't just solve the browserify issues here.

@dlongley will have to comment on all this, but I think some of the goals of jsonld.js were:

We had some payswarm code that was basically what is in the request extension. I figured that was of general use and moved it into this library, with an extension method. The idea being you didn't need to include it. I also needed a debug tool and the ./bin/jsonld code came out of that. It uses the main lib and extension and lets use do basic JSON-LD things on local files, stdin, and web resources and handles a variety of input formats. Works fairly well too.

Putting the library directly as a script in a browser works just fine. It just ignores the node.js code. But browserify goes to great lengths to pull in the kitchen sink as if the lib was running node. I get that this is the point of browserify, but it certainly seems inefficient in this case. It would be nice to have a simple version that didn't depend on 'request' and pull in everything it needs. A less simple version could use XHR or jQuery to grab resources rather than digging into 'request'. And if you really need it, you could use the full version.

I think this is all possible with some refactoring. However, it would likely mean making the code more directed at a node.js style and require the use of a browserify step and shims to get a version for browsers. That is an extra hassle for quick use but maybe it's not a big deal?

Here's how that would go down:

Pros:

Cons:

I'm thinking that just splitting the code will still require some of this sort of shim work anyway to avoid extra dependencies and bloat from browserify, but I could be wrong. There are probably other approaches but this seems like a workable one to some extent.

Comments?

dlongley commented 10 years ago

Could this be of use?

https://npmjs.org/package/ignorify

dlongley commented 10 years ago

Could also maybe just use the "noParse" option in browserify.

elf-pavlik commented 10 years ago

@davidlehn thank you for such in depth reply! I'll take some time to give a thought to issues related to loading remote context and share my reply then. Before that I would like to understand why you bundle RDFa into this library? When I require('jsonld') only in some cases I may want to also use RDFa, and even then I would prefer to simply require('rdfa') in addition, just like I can also require('n3') etc.

mcollina commented 10 years ago

Let's try a softer approach, like detecting we are on browserify, and ignore the fact that we have a node-like environment. In browserify process.title === 'browser' while on node process.title === 'node' (Thanks to this line https://github.com/defunctzombie/node-process/blob/master/browser.js#L40). Then we can use the 'browser' field in package.json to skip all the node libraries, see https://gist.github.com/defunctzombie/4339901.

It seems a little crazy, but it should work with minimum overhead (the process module) that will be merged in anyway on browserify. Does it makes sense?

dlongley commented 10 years ago

@mcollina, sure, if that works then it sounds like the simplest solution.

nopnop commented 10 years ago

@elf-pavlik @dlongley @mcollina You may have a look to #56 to that resolve this issue using package.json "browser" and "browserify" keys.

This patch could be acceptable as a temporary measure. A long-term solution may be a refactoring of js/jsonld.js: removing every reference to node/amd/browser detection, splitting in smaller modules and use a building process to provide browser compatible standalone source (using browserify --standalone option, or umd tool).

mcollina commented 10 years ago

Great!!