ruipgil / scraperjs

A complete and versatile web scraper.
MIT License
3.7k stars 188 forks source link

jquery file injection failing #49

Closed tcarlock closed 8 years ago

tcarlock commented 8 years ago

When I use this module in my project an exception is being raised. Basically, the jquery script is failing to be injected because the file isn't present. The jquery dependency isn't being installed in scraperjs's node_modules directory so the hard-coded path in DynamicScraper fails.

Near as I can tell jquery is not being installed because it's already in the dependencies of my project, so I guess it's being deduped by NPM (I didn't think version 2 did this, but it's the only logical explanation).

I don't know of a reasonable way around this other than including the script file in the src directory.

ruipgil commented 8 years ago

Can you provide which versions of both node and npm are you running?

tcarlock commented 8 years ago

npm 2.7.4 and node v0.12.2. I've confirmed that taking out jquery from my dependencies results in jquery being installed in scraperjs's node_modules, which addresses the issue. So basically, NPM seems to be deduping the dependencies which breaks the hard-coded file load.

ruipgil commented 8 years ago

And only jquery is affected? You should also consider updating npm and node (to v0.12.x) if you can.

tcarlock commented 8 years ago

I am on node 12 and I don't have the option to update npm right now. That being said, I'm not sure how that would help. Does npm 3 allow me to control the de-duping behavior?

Yes, it's only an issue with jquery because you're referencing a file in that module, and the module is not being installed where you assume it will be (in node_modules of scraperjs) because I'm including it in my top-level dependencies.

Why not just include the jquery file in the src? I know it creates a little overhead, but at least then you can be sure that it will be present in the location you expect it to be.

ruipgil commented 8 years ago

JQuery source isn't and won't be included in the source, because it's a dependency, it evolves and so it needs to be updated.

The thing is, npm shouldn't be treating jquery in any special way. As any dependency, it should go to the node_modules folder.

Try this, in the end of the DynamicScraper file, insert console.log(require('path').resolve(__dirname, '/../node_modules/jquery/dist/jquery.min.js')), run your program and post the result.

tcarlock commented 8 years ago

I already did that and it complains that the file isn't present. Again, that's because jquery is not being installed in scraperjs/node_modules, which is because it's being installed in my project's node_modules directory. In short, npm IS treating jquery as a special case because I have jquery listed as a top-level dependency in my project.

I confirmed this by removing jquery from dependencies in my package.json. After doing that, jquery was installed in your library's node_modules directory and everything worked fine. I get your aversion to including the jquery file as a static file, but you made a bad assumption in building this: that the jquery dependency will always be in node_modules.

ruipgil commented 8 years ago

No, the assumption is correct, the documentation states,

npm install (in package directory, no arguments):

Install the dependencies in the local node_modules folder (...)

Anyway, you probably have a non-standard structure in your project, b19b611 should solve it, since there is no more hard-coded locations.

tcarlock commented 8 years ago

Yes, that does solve it, and deduping is the default behavior of NPM, so I'm not sure what to tell you. Thanks for the help.

ruipgil commented 8 years ago

I'm glad we solved it.

On Sat, Nov 7, 2015 at 10:15 PM, Tim Carlock notifications@github.com wrote:

Yes, that does solve it, and deduping is the default behavior of NPM, so I'm not sure what to tell you. Thanks for the help.

Reply to this email directly or view it on GitHub: https://github.com/ruipgil/scraperjs/issues/49#issuecomment-154757700