Closed ohidurbappy closed 3 years ago
Hey, messy in what way? You mean the results won't be the expected ones?
@alirezamika Suppose we have 3 blocks of text, each with 400 words. Imagine having to put 3×400 words in a script!
I see. Adding support for regular expressions would be nice. You can also add them in a separate file for now.
Hello, I have the same problem. I'm getting a block of text via innerText, and sometimes it does not get matched.
I'm not sure what your problem is exactly, but you may want to adjust the text_fuzz_ratio argument when calling the build method.
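To illustrate why lowering the ratio might help with text copied via innerText, here is a minimal sketch. It assumes the comparison behaves like a similarity ratio from Python's difflib; the thread does not confirm how the library computes it, and fuzzy_match is a hypothetical helper, not part of the library.

```python
from difflib import SequenceMatcher


def fuzzy_match(a: str, b: str, ratio_limit: float) -> bool:
    """Hypothetical helper: True if the similarity of a and b reaches ratio_limit."""
    return SequenceMatcher(None, a, b).ratio() >= ratio_limit


# Text as it might appear in the page source, with a line break and indentation.
page_text = "Lorem ipsum dolor sit amet,\n    consectetur adipiscing elit"
# The same text as copied from the rendered page (whitespace collapsed).
copied_text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"

exact = fuzzy_match(page_text, copied_text, 1.0)  # requires identical strings
loose = fuzzy_match(page_text, copied_text, 0.9)  # tolerates small differences

print(exact, loose)

# With the library itself, the suggestion above would look roughly like:
#   scraper.build(url, wanted_list, text_fuzz_ratio=0.9)
# (0.9 is an illustrative value, not a recommendation from the thread.)
```

Whitespace differences between the rendered text and the page source are one plausible reason an exact comparison fails while a slightly looser one succeeds.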
In the latest version (v1.1.10), you can use regular expressions as wanted items:
wanted_list = [re.compile('Lorem ipsum.+est laborum')]
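As a quick check of what that pattern accepts, the sketch below uses only Python's re module and the sample text from this issue. How the library applies the pattern internally is not stated in this thread, and abbreviated_pattern is a hypothetical helper for illustration.

```python
import re

# The full block of text from the issue description.
long_text = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod "
    "tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim "
    "veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea "
    "commodo consequat. Duis aute irure dolor in reprehenderit in voluptate "
    "velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint "
    "occaecat cupidatat non proident, sunt in culpa qui officia deserunt "
    "mollit anim id est laborum"
)

# The pattern from the comment above: fixed start, fixed end, anything between.
pattern = re.compile('Lorem ipsum.+est laborum')
matches_single_line = pattern.fullmatch(long_text) is not None

# By default "." does not match a newline, so text spanning lines is missed.
multi_line_text = long_text.replace(". Ut enim", ".\nUt enim")
matches_multi_line = pattern.search(multi_line_text) is not None


def abbreviated_pattern(start: str, end: str) -> "re.Pattern[str]":
    """Hypothetical helper: build a start...end pattern that also spans newlines.

    re.escape keeps characters such as "(" or "?" in the text literal.
    """
    return re.compile(re.escape(start) + r".+" + re.escape(end), re.DOTALL)


robust = abbreviated_pattern("Lorem ipsum", "est laborum")
matches_with_dotall = robust.search(multi_line_text) is not None

print(matches_single_line, matches_multi_line, matches_with_dotall)
```

If the target text contains line breaks or regex metacharacters, compiling with re.DOTALL and escaping the literal parts may be needed; the resulting pattern object can then be placed in wanted_list as shown above.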
When the target value is a large block of text, the script becomes messy. Could a feature be added so that we can specify the text in an abbreviated form?
For example:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
can be defined as:
Lorem ipsum(...)est laborum