apify / actor-templates

This project is the :house: home of Apify actor template projects to help users quickly get started.
https://apify.com/
25 stars 14 forks source link

feat: add wrapper templates #223

Closed barjin closed 10 months ago

barjin commented 10 months ago

Related to apify/apify-cli#388 and apify/apify-cli#393.

The Scrapy wrapper template is based on, but not a 1:1 copy of the Scrapy Actor template. By maintaining both templates in this repository, we can reuse some of the CI/CD logic and maybe even generate one from the other later on.

barjin commented 10 months ago

@B4nan what do you think? The only things that could be of your interest are the new manifest for the wrappers (I slimmed that slightly down, opposed to the templates manifest) and the slightly updated build script, so it builds the wrappers as well.

Oh, and the fetchManifest method that now has an optional parameter.

Merging + releasing this is currently blocking the apify-cli PR, so I'd like to have this out asap so we can move on with final round of testing there.

B4nan commented 10 months ago

Yeah looking good, sorry for the delay.

I was also thinking about the package yesterday, and I'd say we should be fine without it as well if the migration is not destructive and only adds new files - we could probably replace old files added by a previous migration with newer versions on a second apify init --scrapy, right? (under a flag for sure, without it we should probably throw saying that the project seems to be migrated already)

barjin commented 10 months ago

Huh, now that's something I didn't think of. I assume it shouldn't cause too many problems - the only thing that could wreak havoc is concatenating the .gitignore + .dockerignore files... which is quite useful, since without it, git will start tracking the /storage folder after the first apify run.

Btw, how do we release this two-year-old package? 😬

B4nan commented 10 months ago

The templates are downloaded on the fly in the CLI, it's not a package, right? Not sure if they are bundled to the CLI too like we do it in crawlee.

Huh, now that's something I didn't think of. I assume it shouldn't cause too many problems - the only thing that could wreak havoc is concatenating the .gitignore + .dockerignore files... which is quite useful, since without it, git will start tracking the /storage folder after the first apify run.

We could still merge things and deduplicate the lines probably. Or just skip those files, the problems will be usually in the python code and not really in this part I think. Or the command could first copy the original files somewhere, e.g. adding .old extension or having a dedicated folder for that. Just brainstorming...

barjin commented 10 months ago

downloaded on the fly in the CLI

They are indeed, but there is a package for fetching the manifest from the correct URL etc. I guess we could just move around this, I just kinda liked the fact that there is no hardcoded URL in the dependent projects.

https://www.npmjs.com/package/@apify/actor-templates

B4nan commented 10 months ago

Maybe that's some artifact from the past, and nowadays the templates are always just downloaded, given the package is 2 years old...

B4nan commented 10 months ago

If it's needed, we can publish manually.

barjin commented 10 months ago

They are fetched manually, but the manifest is still downloaded through this package, see here in apify-cli.

If you could publish this with your token, that would be great (I'm not in the maintainers org on NPM).

B4nan commented 10 months ago

Right, you changed the implementation of the fetching. Ok will give it a try locally.

B4nan commented 10 months ago

Here you go, I hope its enough like this, it only includes the index file as nothing else is referenced from it.

https://www.npmjs.com/package/@apify/actor-templates?activeTab=code

barjin commented 10 months ago

seems to be working, cheers 👍🏽