DannyBen / snapcrawl

Crawl a website and take screenshots
MIT License

css_selector option seems to not work #48

Closed. wburningham closed this issue 1 year ago

wburningham commented 1 year ago

Let me know if I am just using the option incorrectly, but if not, it seems the css_selector option is not working.

Command run

docker run --rm -it --network host --volume "$PWD:/app" dannyben/snapcrawl https://gridfinity.xyz/specification/ depth=0 log_level=0 screenshot_delay=3 css_selector="#main_content"

Expected outcome

A screenshot of just the main content rather than the entire page.

DannyBen commented 1 year ago

Correct. Bug.

I found the problem, but because snapcrawl relies on aging dependencies, it is getting harder to maintain. Tests are failing, and I am not sure I have the energy for it.

wburningham commented 1 year ago

Thanks for finding the problem. I'm no Ruby/gem expert, but I can build my own Docker images.

If I wanted to reference a commit from your PR to pull in the fix, what do I change on this line? https://github.com/DannyBen/docker-snapcrawl/blob/0c2405c9a15186eb28c52c265b7ea13b24548598/Dockerfile#L18C9-L18C16

DannyBen commented 1 year ago

You would have to replace that line with the following steps:

  1. Run git clone on the repository and cd into it.
  2. Run git checkout fix/css-selector to switch to the PR branch.
  3. Run bundle to install Ruby dependencies.
  4. Run bundle exec run gem build --install to build and install the gem.
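
The four steps above can be sketched as a small shell function. This is only an illustration: the function name build_snapcrawl_branch is hypothetical, and the clone URL is an assumption inferred from the repository name, not something stated in this thread.

```shell
# Sketch only: build and install snapcrawl from the PR branch.
# Assumption: the clone URL below is inferred from the repository name.
# The function name and its two optional arguments are hypothetical.
build_snapcrawl_branch() {
  repo_url="${1:-https://github.com/DannyBen/snapcrawl.git}"
  branch="${2:-fix/css-selector}"

  git clone "$repo_url" snapcrawl &&      # 1. clone the repository
    cd snapcrawl &&                       #    and enter it
    git checkout "$branch" &&             # 2. switch to the PR branch
    bundle &&                             # 3. install Ruby dependencies
    bundle exec run gem build --install   # 4. build and install the gem
}
```

Calling build_snapcrawl_branch with no arguments runs the steps in order and stops at the first one that fails. In a Dockerfile, the same commands would presumably be chained inside a RUN instruction in place of the referenced line.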

DannyBen commented 1 year ago

I will either try harder to fix the failing tests, or archive the repository. The aging dependencies have no replacement.

DannyBen commented 1 year ago

Good news. Tests are passing. I will release a pre-release in a few, and if it works for you, I will release a version.

wburningham commented 1 year ago

Awesome! Thanks for the fast turn-around time.

DannyBen commented 1 year ago

> Awesome! Thanks for the fast turn-around time.

I either do it quickly or not at all :)

Version 0.5.4.rc1 is on RubyGems. Docker will take a few more minutes.

DannyBen commented 1 year ago

Docker is also updated. If it works for you, I will release 0.5.4.

wburningham commented 1 year ago

I just tested dannyben/snapcrawl:0.5.4.rc1 locally, and it correctly captures using the css_selector option.
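
The exact command used for this test is not shown. A plausible sketch, assuming it was the command from the original report with the image pinned to the pre-release tag (the function name snap_main_content is hypothetical):

```shell
# Sketch only: re-run the reported command against a specific image tag.
# Assumption: the arguments mirror the command from the original report.
# The function name and its optional tag argument are hypothetical.
snap_main_content() {
  tag="${1:-0.5.4.rc1}"

  docker run --rm -it --network host --volume "$PWD:/app" \
    "dannyben/snapcrawl:$tag" \
    https://gridfinity.xyz/specification/ \
    depth=0 log_level=0 screenshot_delay=3 css_selector="#main_content"
}
```

Running snap_main_content uses the 0.5.4.rc1 image. Passing another tag as the first argument makes it possible to compare the result against a different release.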

DannyBen commented 1 year ago

Alright. I will let it soak for the night and release a version tomorrow.

DannyBen commented 1 year ago

Fixed in version 0.5.4.