-
Hi,
Thank you for this great project; we find it quite useful for our conversational AI projects.
Let me explain the feature I'm suggesting: In the current (default) behavior, when the time is missing it i…
-
## Reproduction
I'm using this method to parse a German date without a year:
```python
dateparser.parse('13.01.', languages=['de'])
```
What I get returned is a `datetime` object with the current d…
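To illustrate the distinction being asked for (which parts were actually present in the input versus filled in afterwards), here is a minimal standard-library sketch. It does not use or describe dateparser's API; the names `PartialDate` and `parse_partial_german_date` are hypothetical, and it only handles numeric `DD.MM.` / `DD.MM.YYYY` strings.

```python
import re
from typing import NamedTuple, Optional


class PartialDate(NamedTuple):
    """Hypothetical container: year is None when the input had no year."""
    day: int
    month: int
    year: Optional[int]


# Matches numeric German-style dates such as "13.01." or "13.01.2020".
_GERMAN_DATE = re.compile(r"^\s*(\d{1,2})\.(\d{1,2})\.(\d{4})?\s*$")


def parse_partial_german_date(text: str) -> Optional[PartialDate]:
    """Return the parts found in `text`, or None if it does not match.

    No calendar validation is done here (e.g. "99.99." is accepted);
    the point is only to keep "missing" distinct from "defaulted".
    """
    match = _GERMAN_DATE.match(text)
    if match is None:
        return None
    day, month, year = match.groups()
    return PartialDate(int(day), int(month), int(year) if year else None)
```

With this, `parse_partial_german_date('13.01.')` reports `year=None` instead of silently substituting the current year, so the caller can decide how to fill the gap.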
-
We need to get the size (in bytes) and the download duration of every URL fetched by Splash, for example the details of all embedded images and CSS files.
While executing "[window.performance](https://developer…
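One possible route, sketched here under assumptions, is to read these numbers from HAR data (Splash can produce HAR output, for example via `splash:har`). The function below only relies on fields defined by the HAR 1.2 format (`log.entries[].request.url`, `response.bodySize`, `response.content.size`, `time`); the name `summarize_har` and the sample data in the usage note are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def summarize_har(har: Dict[str, Any]) -> List[Tuple[str, int, float]]:
    """Return (url, size_in_bytes, duration_ms) for every entry in a HAR log."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        response = entry.get("response", {})
        # HAR 1.2 uses -1 for "unknown"; fall back from bodySize to content.size.
        size = response.get("bodySize", -1)
        if size is None or size < 0:
            size = response.get("content", {}).get("size", -1)
        # "time" is the total elapsed time of the request in milliseconds.
        rows.append((entry["request"]["url"], size, entry.get("time", -1)))
    return rows
```

Usage with an illustrative, hand-written HAR fragment (not real Splash output):

```python
sample = {"log": {"entries": [
    {"request": {"url": "http://example.com/a.css"},
     "response": {"bodySize": 120, "content": {"size": 300}},
     "time": 15},
]}}
for url, size, duration in summarize_har(sample):
    print(url, size, duration)
```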
-
I was developing a crawler using Splash when I suddenly started to receive a lot of gateway timeouts. While troubleshooting the problem, I discovered that the cause is the header ```transfer-encoding…
-
There are examples of using cookies in the docs, but no examples of setting the method and body. I think it would be useful to add one, or perhaps even add the following class (with a better name): with it…
-
Let's talk about visibility of request/response bodies in HAR as generated by [`splash:har`](https://splash.readthedocs.io/en/stable/scripting-ref.html#splash-har):
- For response bodies, globally:…
-
Hi,
If any exception occurs during response parsing in Scrapy, the request remains marked as `QUEUED` and no error is logged on the frontier. …
-
Hi,
I am using the Frontera revisiting backend. The spider is scraping previously scraped items. How can I make sure that there will be no duplicates?
Here are my Frontera settings:
```
BACKEND = 'fro…
-
I saw that the request is replaced with `dont_filter=True`; if I remove that, the spider just stops when it reaches the same URL.
I need to use the offsite middleware though, so any thoughts?
I will do …
-
I sent many async requests to Splash from 15 threads, like this:
```python
async with aiohttp.ClientSession() as session:
    async with session.get(
        "http://localhost:8050/render.html",
…