ibrooz94 / scrapy_first

This is a simple template for using Scrapy with Playwright for web scraping.
BSD 3-Clause "New" or "Revised" License

Tutorial.txt question #1

Closed: h0h0h0 closed this issue 5 months ago

h0h0h0 commented 5 months ago

I've added ipython to the .cfg file of my newly created spider, but I don't understand your next line:

fetch('')

Where do I execute that or place that?

ibrooz94 commented 5 months ago

Hello, 'fetch' is meant to be used inside the Scrapy shell, to investigate the website you want to scrape.

First you enter the shell

scrapy shell

Then create a request using fetch

fetch('https://example.com')

Then you can use the response to investigate the page for your spider. My 'tutorial.txt' isn't really a tutorial per se; here's a link to freeCodeCamp's video tutorial, which I used to get started with this.

https://www.freecodecamp.org/news/use-scrapy-for-web-scraping-in-python/

The Scrapy shell is covered at around the 34-minute mark of the video.

h0h0h0 commented 5 months ago

Cool. I just grabbed this template because it had Scrapy, Playwright, and Poetry together. I'm learning all three of them right now, heh.

ibrooz94 commented 5 months ago

I used this to learn them too; that's actually why this repo exists 😅. That's what the 'tutorial.txt' was for: a mini refresher for me. I wish you good luck.

Also, a heads-up if you're on Windows: Playwright + Scrapy doesn't work on Windows, so you'll have to use WSL.