npms-io / npms-analyzer

The analyzer behind https://npms.io
MIT License
319 stars 38 forks source link

[WIP] feat: `dependents` and `dependencies` qualifiers #235

Open larsgw opened 5 years ago

larsgw commented 5 years ago

WIP:

Closes https://github.com/npms-io/npms/issues/29

satazor commented 5 years ago

@larsgw One issue with dependants is that it can be really huge, for instance, there’s hundreds, probably thousands, packages that depend on lodash. That’s why I wanted to understand the use-case and it seems that you only need dependencies.

Having said that, you need to set an array of dependency names based on the packageJson.dependencies & peerDepedencies, and index them on Elasticsearch. You must also to add that field to the elasticsearch config that lives in the /config folder. Note that an array of strings won’t allow you to search for things like 2.X; in that case, you would need an array objects, each object containing the name and the major version.

Once that is done, you just need to support a new modifier in npms/queries repo that use this new field in ES. What name for the modifier do you think it’s best? depends-on:xxx or has-dependency:xxx or other?

larsgw commented 5 years ago

@larsgw One issue with dependants is that it can be really huge, for instance, there’s hundreds, probably thousands, packages that depend on lodash. That’s why I wanted to understand the use-case and it seems that you only need dependencies.

Yeah, I'm mixing those up a bit. Let me get this straight: for searching dependents of a pkg we need in-dependencies:<pkg> and thus the package dependencies; for dependencies we'd need in-dependents:<pkg> and thus the entire list of dependents. Lodash has 85,770 dependents so that's a bit much, however I don't see a use case for that either.

Having said that, you need to set an array of dependency names based on the packageJson.dependencies & peerDepedencies, and index them on Elasticsearch. You must also to add that field to the elasticsearch config that lives in the /config folder.

Thank you, I couldn't find that part.

Note that an array of strings won’t allow you to search for things like 2.X; in that case, you would need an array objects, each object containing the name and the major version.

How would that work in ES?

Once that is done, you just need to support a new modifier in npms/queries repo that use this new field in ES. What name for the modifier do you think it’s best? depends-on:xxx or has-dependency:xxx or other?

I was thinking of continuing the keywords: naming scheme but those sound fine too.

satazor commented 5 years ago

How would that work in ES?

I think if we store an array of objects, we will need nesting: https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html

I was thinking of continuing the keywords: naming scheme but those sound fine too.

You can choose what sounds best for you and we decide on the final name on the PR.

larsgw commented 5 years ago

I added dependencies and config for ES (I forgot about the major version part). What's going wrong with the tests?

daKmoR commented 5 years ago

anything we can do to help this along? 🤗

thepassle commented 5 years ago

Is this still in the works? If there's any way we could help out we'd love to hear it, though we may need some pointers/guidance as @satazor suggested here.

daKmoR commented 5 years ago

friendly reminder about this 🤗

I could start playing around myself to do an actual code contribution but that seems to almost exactly what is needed? so if there is any possibility to finish this one up would be nice 🤗

if not just let us know and we will try to prepare a PR ourself... 😅

satazor commented 5 years ago

Regarding the actual state of this PR, the dependents is an array that will really large for popular packages. I would say it's better to drop dependants from this PR and discuss how we can tackle it in a better (if there's still the need for it).

Other than that, the PR is looking good but there's still some work to do (checklist).

daKmoR commented 5 years ago

I'm a little confused with dependents and dependencies 🙈

As outlined here https://github.com/npms-io/npms/issues/29#issuecomment-469415321 the information that we are after is to find out which direct dependcies are used.

e.g. so I can say "Show me all packages that have lit-element 2.x as a direct dependency" so would this be dependents or dependencies? 😕 🤔

satazor commented 5 years ago

@daKmoR for that use-case, we only need dependencies. To avoid any confusion, the search modifier has to be well chosen. Something like has-dependency:xxx.

daKmoR commented 5 years ago

uh yeah that sounds way better 🤗 maybe even call it has-direct-dependency: lit-element@2.x ?

so I am all for dropping dependents and renaming dependencies to has-direct-dependency 🤗

@larsgw are still on board? or should I try my luck?

PS: I probably need to do an ElasticSearch course first - so might take a while 🤣

larsgw commented 5 years ago

I think removing dependents is fine, and the renaming too. I'm not great with ElasticSearch and I'm having trouble setting up the complete dev environment, so if you want to continue that's fine too.