Open Lucs1590 opened 3 years ago
Thank you! It was very helpful. I intend to learn about this, and I did not know much about scrapers, so I was searching for anything that could help me. Once again, thank you.
On Tue, Jan 5, 2021 at 07:39, Lucas de Brito <notifications@github.com> wrote:
Hi @Raisler https://github.com/Raisler, what's up? Dude, I took a look at your project and saw that you use Selenium to build the crawler. I have done this a lot too, because (1) it is practical and (2) it works like a charm. However, crawling is not Selenium's core purpose: it was made for testing, and there are libraries and frameworks specialized for crawling, such as Scrapy. So I recommend using the Scrapy framework for this project; I believe you will see a gain, at least in performance. If you are doing this project just to learn more about Selenium, disregard everything I wrote. Otherwise, the following tutorial can help you implement it with Scrapy: http://pythonclub.com.br/material-do-tutorial-web-scraping-na-nuvem.html
Note that the following snippet is enough to get the views, title, and link of each video.
```python
import scrapy


def first(sel, xpath):
    # Return the first match for the XPath, or None if nothing matches.
    return sel.xpath(xpath).extract_first()


class YoutubeChannelLister(scrapy.Spider):
    name = 'channel-lister'
    youtube_channel = 'portadosfundos'
    start_urls = ['https://www.youtube.com/user/%s/videos' % youtube_channel]

    def parse(self, response):
        # Iterate over the <li> entries of the channel grid.
        for sel in response.css("ul#channels-browse-content-grid > li"):
            yield {
                'link': response.urljoin(first(sel, './/h3/a/@href')),
                'title': first(sel, './/h3/a/text()'),
                'views': first(sel, ".//ul/li[1]/text()"),
            }
```
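A side note on the `link` field: the `href` values in a listing like this are usually relative paths, which is why the snippet passes them through `response.urljoin`. Here is a minimal sketch of roughly the same resolution using only the standard library. The relative link `/watch?v=VIDEO_ID` is a made-up placeholder, not a real video, and the sketch assumes the page has no `<base>` tag that changes the base URL.

```python
from urllib.parse import urljoin

# Page the spider starts from (taken from the snippet above).
page_url = "https://www.youtube.com/user/portadosfundos/videos"

# Hypothetical relative href, as it might appear in an <a> tag on that page.
relative_href = "/watch?v=VIDEO_ID"

# Resolve the relative link against the page URL, which is roughly what
# response.urljoin does for a response fetched from page_url.
absolute_link = urljoin(page_url, relative_href)
print(absolute_link)
```

Because the href starts with `/`, it replaces the whole path of the page URL and keeps only the scheme and host.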
I hope I have helped!