darrenjennings / algolia-docsearch-action

runs the docsearch scraper and updates an index

[IMPROVE] Make the Action faster #3

Closed aditya-mitra closed 3 years ago

aditya-mitra commented 3 years ago

Short Description

Instead of using Algolia's Docker image for docsearch, this PR uses the source repository for scraping and uploading to Algolia.

Details

Using the source repository removes the need for jq, the docker-cli installation, the Algolia docsearch Docker image, and other peer dependencies.

With python:3.6 as the base image (which comes with git preinstalled), the algolia-docsearch repository is first git cloned. pipenv is then installed and used to install the packages listed in the Pipfile.

With this setup, we can now simply run:

python docsearch run $GITHUB_WORKSPACE/$FILE
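The setup steps above can be sketched roughly as a shell script. This is only an illustration, not the action's actual entrypoint: the repository URL, the `SCRAPER_DIR` path, the `run` and `setup_and_scrape` helper names, and the `DRY_RUN` switch are assumptions introduced here. The PR text also does not say whether the scraper is invoked through `pipenv run` or with packages installed system-wide, so `pipenv run` is a guess.

```shell
#!/bin/sh
# Sketch of the steps described in this PR (assumptions are marked).
set -eu

# Assumed clone URL and checkout directory; the PR only says
# "the algolia-docsearch repository".
SCRAPER_REPO="${SCRAPER_REPO:-https://github.com/algolia/docsearch-scraper.git}"
SCRAPER_DIR="${SCRAPER_DIR:-/tmp/docsearch-scraper}"

# Hypothetical helper: with DRY_RUN=1, print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: clone the scraper, install its dependencies,
# then run it against the given config file.
setup_and_scrape() {
  config_path="$1"
  run git clone --depth 1 "$SCRAPER_REPO" "$SCRAPER_DIR"
  run pip install pipenv
  run cd "$SCRAPER_DIR"
  run pipenv install
  # Assumption: the scraper runs inside the pipenv environment.
  run pipenv run python docsearch run "$config_path"
}
```

Calling `DRY_RUN=1 setup_and_scrape "$GITHUB_WORKSPACE/$FILE"` prints the five commands without touching the network, which is a cheap way to check the order of the steps before running them for real.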

Improvements

The running time of the action has been reduced by 40 seconds.

You can have a look at the comparison run here.


Further Comments

I have also made a few other corrections, such as fixing the spelling of "algolia".

In addition, I changed config.example.json: the previous config took a long time to index, which would have made the difference in running time hard to see.

Related Issues

Closes #2

aditya-mitra commented 3 years ago

Please do squash and merge :smiley: I have made some irrelevant commits.

darrenjennings commented 3 years ago

Released the latest version and published it to the marketplace: https://github.com/marketplace/actions/algolia-docsearch-indexer

aditya-mitra commented 3 years ago

This is great; I think the speed of the action outweighs the cons. I originally used Docker because using the algolia/docsearch-scraper image meant we did not have to worry about what was being used under the hood (maybe tomorrow they rewrite it in Rust). Thanks! I don't use this action anymore, so I'd appreciate it if you could test it as needed and let me know of any issues.

Thanks a lot!

I thought I could improve the speed further (by at least another 25 seconds) if we used a custom Docker image with the docsearch repository already installed. With Docker's layer caching, only the final layer would then be needed on each run.

However, publishing an image to Docker Hub seemed like something that would need further discussion first.