EdJoPaTo / website-stalker

Track changes on websites via git
GNU Lesser General Public License v2.1

Latest version breaks Github Action #188

Closed bgervan closed 1 year ago

bgervan commented 1 year ago

Describe the bug

The GitHub Action started removing all folders and files from the repo before fetching the websites.

Versions

Run EdJoPaTo/website-stalker-github-action@v1
  with:
    version: latest
    triple: x86_64-unknown-linux-gnu
  env:
    WEBSITE_STALKER_FROM: ***
Run base=https://api.github.com/repos/EdJoPaTo/website-stalker/releases/
+ base=https://api.github.com/repos/EdJoPaTo/website-stalker/releases/
+ '[' latest == latest ']'
+ base+=latest
+ releases=/tmp/website-stalker-releases.json
+ curl --header 'authorization: ***' --output /tmp/website-stalker-releases.json https://api.github.com/repos/EdJoPaTo/website-stalker/releases/latest
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 28968  100 28968    0     0  96594      0 --:--:-- --:--:-- --:--:-- 96882
+ jq . /tmp/website-stalker-releases.json
{
  "url": "https://api.github.com/repos/EdJoPaTo/website-stalker/releases/120012263",
  "assets_url": "https://api.github.com/repos/EdJoPaTo/website-stalker/releases/120012263/assets",
  "upload_url": "https://uploads.github.com/repos/EdJoPaTo/website-stalker/releases/120012263/assets{?name,label}",
  "html_url": "https://github.com/EdJoPaTo/website-stalker/releases/tag/v0.21.0",
  "id": 120012263,
.....

Expected behavior

Works as before with the same config.

Additional information

Run website-stalker run --all
  website-stalker run --all
  shell: /usr/bin/bash -e {0}
  env:
    WEBSITE_STALKER_FROM: ***
Warning: Remove superfluous "assets/images/coverage.svg"
INFO: Some sites are on the same host. There is a wait time of 5 seconds between each request to the same host in order to reduce load on the server.
Warning: Remove superfluous "docker/Dockerfile"
Warning: Remove superfluous "docker/README.md"
Warning: Remove superfluous "paddle_billing_client/__init__.py"
Warning: Remove superfluous "paddle_billing_client/client.py"
Warning: Remove superfluous "paddle_billing_client/endpoints.py"
Warning: Remove superfluous "paddle_billing_client/formatters.py"
Warning: Remove superfluous "paddle_billing_client/models/__init__.py"
Warning: Remove superfluous "paddle_billing_client/models/address.py"
Warning: Remove superfluous "paddle_billing_client/models/adjustment.py"
Warning: Remove superfluous "paddle_billing_client/models/base.py"
Warning: Remove superfluous "paddle_billing_client/models/business.py"
.......
bgervan commented 1 year ago

Fix: Add working-directory: sites to action steps

- name: Check website-stalker config
  working-directory: sites
  run: website-stalker check

- name: Run website-stalker
  working-directory: sites
  run: website-stalker run --all

Also move website-stalker.yml (the config) into the working directory used, in the above case the sites folder.

EdJoPaTo commented 1 year ago

This is basically the intended behaviour of #187, as this tool assumes a clean repository with only website-stalker.yaml and the stalked files. (Hidden files and folders like .git or .github are preserved.) I am not sure about working-directory, as it probably also breaks the --commit feature.

One possibility would be an additional option to define the root folder of the sites. The downside would be more logic. Alternatively, every site entry could go into sites again, with the domain structure inside the sites folder.

Out of curiosity: why do you need other content in the same repository? Is there a use case I wasn't aware of yet?

bgervan commented 1 year ago

The new behaviour is fine, but it broke the GitHub Action we were already using, out of nowhere. Is there an option in the GitHub Action where we can specify the desired version?

EdJoPaTo commented 1 year ago

Sorry for the breaking change; your setup is different from what I expected.

There is a version input which should work:

https://github.com/EdJoPaTo/website-stalker-github-action/blob/a1699261970e3d2573a655a2051bd9be056530c8/action.yml#L7-L10
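
As a rough sketch, pinning a release through that `version` input might look like the step below. The step name and the tag `v0.20.0` are illustrative assumptions only; check the release list for a real tag and the format the action expects before using it.

```yaml
# Hypothetical workflow step: pin website-stalker instead of using "latest".
- name: Install website-stalker
  uses: EdJoPaTo/website-stalker-github-action@v1
  with:
    # Assumption: an earlier release tag; replace with a tag that actually exists.
    version: v0.20.0
```

With a pinned version, a new release should no longer change the behaviour of an existing workflow until the pin is updated on purpose.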

bgervan commented 1 year ago

Thanks, I will use that too. No worries, we can close this; my comment shows the solution in case others run into the same problem.

EdJoPaTo commented 1 year ago

Your comment is a workaround which will prevent --commit from working correctly. I'm still curious about your use case, and the thought of putting the files into sites again is not dead.

bgervan commented 1 year ago

I have a repo which is an API wrapper, and I am using the stalker to check for updates to the API docs. I am using the sites folder to store the fetched txt files. Basically, I am using the same approach as the commit which introduced the change. The config was in the root, and the stalker downloaded the websites into the "sites" folder by default. With the latest version, I moved the config to the sites folder and changed the action steps accordingly, so the stalker only removes files inside the sites folder.

I am not sure I understand what is prevented by my approach and usage. The above solution does require manual work on the commit to remove the old files, though, if that is what you mean.

bgervan commented 1 year ago

https://github.com/EdJoPaTo/website-stalker/commit/fd9163fc376ee1913600fa24501e616983d2d90c This commit contains the same changes that I mentioned: the yml is moved to the sites folder, etc.