-
When scraping the ranking of movies on Douban, the message "IE 11 is not supported. For an optimal experience, visit our site on another browser" appears. I also encountered the same problem when scra…
-
**Description**
Changes of button state should be clearly announced so that users understand what happened, especially users of assistive technology.
**Preconditions**
Stateful Web crawlers -…
-
The entities marked as reserved [here](https://do.remifa.so/archives/unicode/latin1.html) (scroll down to see the list) are extracted literally by `lxml`, whereas it should probably strive for more co…
-
Hi, when I extracted the brain using `antspynet.utilities.brain_extraction` according to the ANTsPyNet documentation (https://antsx.github.io/ANTsPyNet/docs/build/html/utilities.html#applications), an error happened…
-
### What problem are you trying to solve?
Currently, data is being extracted from the DOM using JavaScript, which can be inefficient and slow, especially for complex or large documents. This method m…
-
## Overview Description
Text extraction fails after an HTML attribute localization inside quotation marks.
## Steps to Reproduce
Run the extraction on the following JavaScript template string code:
```…
-
# What?
When we produce the pure OCR text from the HOCR/PDFALTO extraction, we keep the HTML entity encoding. This hurts Views display because, internally, Twig cannot decode the entities and will d…
-
I ran the demos of both the regular text extraction and the HTML extraction found in the README. The text extraction worked as expected. However, the HTML extraction simply returned the original…
-
Firmware:
https://www.tenda.com.cn/download/detail-3901.html
Environment:
Ubuntu 20.04 + a compiled build of binwalk
Known:
Running `binwalk -Me US_AX1806V2.0br_v1.0.0.1_cn_2997_ZGDX01.bin` on its own extracts `_US_AX1806V2.0br_v1.0.0.1_cn_2…
-
**Issue by [will3216](https://github.com/will3216)**
_Wed Nov 4 18:48:57 2015_
_Originally opened as https://github.com/codelucas/newspaper/issues/168_
----
In extractors.py:173 it says that the p…