custom-crawler Search Results

1000+ results
for custom-crawler

Best match

infinilabs/crawler #42

How to customize search and crawler api?

It crawled some useless site content such as the nav menu, header, and footer. How can these be removed from the crawler or search API results?

wxs77577 updated 5 years ago
2 comments

FlowiseAI/Flowise #2327

[FEATURE] Web scrapers - ignore / remove some elements or a…

Hello, I have a Flowise workflow that scrapes our entire website (150+ pages) and then saves it to Pinecone. We are currently using the Cheerio Web Scraper node. (It could be Puppeteer, Playwright - it does…

bendadaniel updated 4 weeks ago
4 comments

openzim/zimit #376

Puppeteer errors causing program termination

I was using Zimit to archive the SCP-CN Wikidot site and encountered an interruption of the program due to a puppeteer error. Attached here is the log output before the program exits. ``` {"timesta…

MCSeekeri updated 2 weeks ago
16 comments

openzim/zim-requests #1052

edu.gcfglobal.org ZIMs are all missing some content

### ZIM(s) location https://library.kiwix.org/#lang=&q=gcf ### Recipe(s) URL https://farm.openzim.org/recipes?name=edu.gcfglobal.org ### Readers tested - [ ] Kiwix-serve on iOS (iPad / iPhone) - …

benoit74 updated 3 months ago
7 comments

webrecorder/browsertrix-crawler #486

Make screenshot after custom behaviors

Currently it seems screenshots are taken before custom behaviors run. It would be very useful to be able to take a screenshot after custom behaviors, for example to capture a screenshot after removing the "acc…

cmillet2127 updated 1 month ago
5 comments

data-dot-all/dataall #1429

Custom confidentiality mapping should be in dataset_base inst…

**Is your idea related to a problem? Please describe.** Since confidentiality is a business classification type, it should be in the datasets_base section. With new changes in v2.6, the custom confi…

TejasRGitHub updated 1 week ago
4 comments

alecxe/scrapy-fake-useragent #12

Custom folder for local Cache DB

I made a small tweak so we can have a custom folder for the local cache DB. ` class RandomUserAgentMiddleware(object): def __init__(self, crawler): super(RandomUserAgentMiddleware, self).__…

bezkos updated 7 years ago
1 comment

TeamHG-Memex/scrapy-rotating-proxies #40

Refresh the list of proxies during scraping

Hello, I found that the list of proxies is loaded in from_crawler (middleware.py): the load happens in the object's constructor. I read this on a good scraping site: " ...write some code that would a…

dibodin updated 4 months ago
11 comments

azerothcore/azerothcore-wotlk #7961

Shellfish Trap doesn't spawn Drysnap Crawler

### Current Behaviour No Drysnap Crawler is spawned. It never awards more than one shellfish. ### Expected Blizzlike Behaviour Opening a Shellfish Trap should sometimes spawn an aggressive Drysnap …

vtihomirov updated 11 months ago
1 comment

matomo-org/matomo #5186

"Inverse" Custom Event tracking - to find bad elements/funct…

If you have a page with several different elements on it, such as some images with a lightbox, a video, elements which can be unfolded, etc., the usage of these elements could be tracked via Custom Eve…

hpvd updated 1 week ago
4 comments
