vctfence / scrapbee

Mozilla Public License 2.0

Scrapbee fails to capture full content sometimes #29

Closed: balbirthomas closed this issue 3 years ago

balbirthomas commented 4 years ago

Hi, sometimes Scrapbee fails to capture all of the content on a page. For example, try:

https://www.nutsvolts.com/magazine/article/can-you-trust-your-voltmeter

If you drag the mouse to select the article, from the picture at the top (above the title) all the way down to and including the table at the bottom, only the top picture and the title are captured. On the other hand, if you skip the top picture, the title and, most importantly, the "View in Digital Edition" link, the rest of the article is captured. The problem occurs when the selection includes the "View in Digital Edition" link. This is just an anchor tag, so I do not understand why it causes the problem. Perhaps Scrapbee could simply ignore such elements; all that really needs to be captured is the images and text.

I am using Scrapbee 1.8.6 downloaded from https://addons.mozilla.org/en-US/firefox/addon/scrapbee/

Thank you for Scrapbee

Kerenok commented 4 years ago

In my environment, Scrapbee can save this page.

vctfence commented 4 years ago

Please try 1.11.9

balbirthomas commented 4 years ago

Hi @vctfence,

Thank you very much. I just tested and can confirm that this problem seems to have been fixed. I am closing the bug report for this reason.

balbirthomas commented 4 years ago

It seems the problem with capturing some pages still occurs. In this case both alternatives, capture page and capture selection, fail. An example is https://www.tjmahr.com/quantile-quantile-plots-from-scratch/ and I am using Scrapbee 2.0.1.

vctfence commented 4 years ago

Seems banner image is missing, right?

balbirthomas commented 4 years ago

> Seems banner image is missing, right?

Yes, that is correct. But if I remember correctly, a few other images were also missing lower down on the page. I have seen a few other examples of such problems. If it would be useful to you, I can gather a few more examples.

vctfence commented 4 years ago

OK, please do, that will be helpful. Please also try 2.0.5, which fixes a bug with background images.

balbirthomas commented 3 years ago

Apologies for the delay in replying. I have not found any significant issues recently. I am currently using Scrapbee 2.1.0 and its behavior is much improved. Thank you. I think we can close this bug. There will always be edge cases that trip up Scrapbee, given how chaotic the World Wide Web is. If I see anything in the future, I will create a separate bug report with details.