ibrooz94 / scrapy_first

This is a simple template for using Scrapy with Playwright for web scraping.
BSD 3-Clause "New" or "Revised" License

project questions #2

Open h0h0h0 opened 6 months ago

h0h0h0 commented 6 months ago

Did you intend for the project to become a template for others to use? I grabbed this because I was looking for a Scrapy/Playwright template that was already set up, and it came up in the search. Or was this just a scratchpad for you?

  1. If you intended for it to be a template, we can set up GitHub templates.
  2. We can make the tutorial.txt a little better. I can send a patch for that.
  3. I wonder if there is a way to make a newly created spider already include the Playwright stuff. I did not realize I needed to add the Playwright configuration to each spider. It would have been nice to have that in the tutorial.txt or something.

Anyway, I don't know what the intention was. If it was not meant for others to use and consume, my apologies!! I stumbled on it and used it in error. Otherwise, I will send a patch with the above changes.

Thanks dude!

ibrooz94 commented 6 months ago

It was initially just a scratchpad, but I also put it out here to provide insight to others, so yes, we can set up GitHub templates.

For number 3: it's possible to enable the Playwright configuration for all spiders globally. I set it up individually so that the other spiders, which don't require Playwright, can run without getting slowed down.

https://github.com/ibrooz94/scrapy_first/blob/6c8dfd1a7db3b1f41c5e7719373729dea6dd7508/bookscraper-src/bookscraper/spiders/quotespider.py#L11-L17

You just need to move the lines linked above from the spider into settings.py.
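
For reference, a minimal sketch of what the global version in settings.py might look like. This assumes the linked spider lines are the standard scrapy-playwright entries (download handlers plus the asyncio reactor). They are not copied from the repository, so check them against the actual file.

```python
# settings.py (sketch): enable scrapy-playwright for every spider in the project.
# Assumption: these mirror the entries currently kept in the spider itself.

# Route HTTP and HTTPS downloads through the Playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```

Even with these set globally, a request is only rendered in a browser when it carries `meta={"playwright": True}`, so each spider still has to opt in on the requests that need it.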

Awaiting your patch. Thanks!