samvera-labs / ldpath

Ruby parser for the LDPath language
http://marmotta.apache.org/ldpath/language.html
Other
9 stars 5 forks source link

can ldpath depend on less than the full linkeddata gem? #19

Closed jrochkind closed 4 years ago

jrochkind commented 4 years ago

The linkeddata gem is "A metadistribution of RDF.rb including all parsing/serialization plugins."

When ldpath depends on the entire linkeddata gem, it means that ALL parsing/serialization plugins and other parts of the RDF.rb ecosystem are pulled into any app that uses ldpath, even if ldpath doesn't use them all, or the app doesn't use them all.

This increases install time and size of any app that uses ldpath. It also potentially introduces dependency unnecessary requirement conflicts -- maybe I want to use a version of some gem that conflicts with the dependency tree of ldpath, even though ldpath isn't actually using that functionality that causes the conflict.

This came up when noticing that linkeddata was restricting nokogumbo to 1.x when a 2.0 was released. And my own app wound up with that restriction via qa => ldpath => linkeddata.

But if qa and my app aren't actually even using nokogumbo at all... it's unfortunately that this even became an issue, or that my app needs to spend time and space installing nokogumbo.

@no-reply mentioned that gems depending on linkeddata should perhaps be instead depending on the particular component parts they actually need instead, to not force downstream apps to install all of linkeddata. (At first I thought it was qa depending on linkeddata, but I got confused -- this stuff gets confusing -- it was ldpath).

To do that, we'd have to figure out what parts of linkeddata ldpath actually needs though, which is confusing to do. In my limited experimentation, it clearly needs at least rdf and nokogiri, but may need more than that too, and I'm having understanding what are just incidental test infrastructure dependencies and what are actually features of ldpath, as well as what gem supplies what functionality.

jrochkind commented 4 years ago

If I try removing linkeddata and adding rdf and nokogiri, and running tests, I get a really confusing error message (one among others, but first I tried to tackle).

      RDF::FormatError:
        unknown RDF format: {:base_uri=>"http://www.bbc.co.uk/programmes/b0081dq5", :headers=>{"Accept"=>"", "User-Agent"=>"Ruby RDF.rb/3.0.13"}, :content_type=>"application/n-triples", :file_name=>"http://www.bbc.co.uk/programmes/b0081dq5"}
        This may be resolved with a require of the 'linkeddata' gem.

The error message around unknown format application/n-triples made me think maybe I needed a gem for n-triples support... but the linkeddata README says "Includes N-Triples and N-Quads support using RDF.rb," that is the rdf gem, which I already had as a dependency.

And then, um, it's telling us to use linkeddata again to resolve. This ecosystem is really confusing, it may be that practically there's no way to avoid bringing in the whole linkeddata omnibus metadistribution, it's too hard to tease out what pieces you need? Not sure.

jrochkind commented 4 years ago

OK, I think that rdf README was wrong.

By process of elimination, I have determined that tests can pass if you remove linkeddata as a dependency, and replace it with only: rdf, nokogiri, and rdf-turtle.

I think it's possible rdf-turtle is necessary only becuase of the nature of a test not actually for actual ldpath functionality, but I could be wrong about that, not totally sure.

I'm also not totally sure that just because tests pass, it means everything actually works with only those dependencies, or that it's safe to do that swap of dependencies -- I totally don't understand how this gem is actually used by whom in what ways etc.

Do you have any thoughts @no-reply ?

no-reply commented 4 years ago

you probably just need an explicit require 'rdf/ntriples' to get the ntriples format.

jrochkind commented 4 years ago

Thanks @no-reply !

I can keep hacking away to get tests to pass with as few dependencies as possible... but I'm not sure that will result in a PR that's wise to actually merge, I don't know how good test coverage is, or if there are going to be lots of downstream uses counting on this gem to supply a dependency that the downstream user is using.

I won't bother doing the work if it's going to result in a PR that's too dangeorus to actually merge, any thoughts? I don't really understand what's going on at all, anyone want to encourage or discourage me from preparing a PR that has tests passing with much reduced dependencies?

jrochkind commented 4 years ago

(Yep, just dependencies rdf and nokogiri, and tests can pass with an explicit require 'rdf/ntriples' -- I'm not sure if the gem code itself should do that require, or it's just something the test file needs to do in order to test the things the way it wants -- or if it makes any difference).

no-reply commented 4 years ago

I think it's possible rdf-turtle is necessary only becuase of the nature of a test not actually for actual ldpath functionality, but I could be wrong about that, not totally sure.

i think qa has a bunch of authorities with ldpath query functionality. if ldpath is causing problems for you and you aren't using those authorities, it may be productive to lobby qa to move that functionality out of its own core.

no-reply commented 4 years ago

I won't bother doing the work if it's going to result in a PR that's too dangeorus to actually merge, any thoughts?

it seems pretty safe to me. removing linkeddata will mostly just remove support for a bunch of RDF formats, all of which can be easily re-added by downstream apps.

jrochkind commented 4 years ago

It's not exactly causing problems, it's just bringing in a LOT of dependencies that you suggested might be unused/unnecessary. Which add to install time and size, and can cause dependency tree conflicts that are unfortunate if they are unnecessary. I don't totally understand how all these dependencies relate myself.

But okay, with that encouragement, I'll try preparing a PR to this ldpath to eliminate the omnibus linkeddata dependency, so things with ldpath in the dependency tree don't get quite so much baggage with it.

gkellogg commented 4 years ago

Nothing you’re using would seem to actually use nokogiri.

Without the linkeddata gem, you need to require each of the readers you actually depend on, such as rdf/ntriples and rdf/turtle.

no-reply commented 4 years ago

@jrochkind FYI: https://github.com/samvera/questioning_authority/issues/298