apify / actor-templates

This project is the :house: home of Apify actor template projects to help users quickly get started.
https://apify.com/

Add templates using the Apify SDK for Python #111

Closed fnesveda closed 1 year ago

fnesveda commented 1 year ago

After we have the Apify SDK for Python implemented, we need to add actor templates that will use it.

There are two main decisions / tasks:

mnmkng commented 1 year ago

Personally, I would rewrite the getting started one, which showcases the platform and its features, and then I would add a crawling one, so probably Scrapy? Not sure if we need a Beautiful Soup one.

Other stuff I don't know, but if you say it's good, then we should probably have it 😅

fnesveda commented 1 year ago

It depends on whether we will have the nice template selector in the console soon or not. If we don't, then I wouldn't add too many templates, because there are already too many and they already don't fit. If we do, then we can add more and have them all in some nice Python category.

metalwarrior665 commented 1 year ago

@fnesveda Do you have a plan for how to do the Scrapy-to-actor mapping, as their structure is more complicated than Crawlee's? If I recall, they have multiple spiders per folder with top-level libraries, so the user would need to select the spider via input, or somehow `cd` into it in the Dockerfile after copying the libraries?

fnesveda commented 1 year ago

No plan yet. I think we can keep the template simple, optimized for just one spider, and solve these complicated things in the Scrapy migrator.

The Scrapy "multiple spiders per project" philosophy does not really align with the "do just one thing but do it well" UNIX (and actor) philosophy.
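
For reference, the "select the spider via input" option raised above could look roughly like the sketch below. It is only an illustration, not a proposed implementation: the `spiderName` input field and the `select_spider` helper are hypothetical, and the list of available spider names is assumed to come from the Scrapy project itself (for example, the names that `scrapy list` prints).

```python
from typing import Mapping, Sequence


def select_spider(actor_input: Mapping[str, object], available: Sequence[str]) -> str:
    """Pick the spider to run based on a hypothetical `spiderName` input field."""
    requested = actor_input.get("spiderName")

    if requested is None:
        # Single-spider project: no selection is needed, which matches the
        # "template optimized for just one spider" case.
        if len(available) == 1:
            return available[0]
        raise ValueError(
            f"Input must set 'spiderName' to one of: {', '.join(available)}"
        )

    if requested not in available:
        raise ValueError(
            f"Unknown spider {requested!r}; available: {', '.join(available)}"
        )

    return str(requested)
```

With a single spider the input can stay empty, so a simple template would not need the extra field at all; only multi-spider projects would have to expose it.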