-
I am deploying my Portia spider with scrapyd. I have given a pattern to be followed in the Crawling section in Portia.
When the spider is deployed, the links do not follow the link pattern which I have give…
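Portia's follow patterns are regular expressions matched against candidate URLs, so only links whose URLs match the pattern should be enqueued. A minimal sketch of that filtering (the domain and pattern below are illustrative, not taken from the original report):

```python
import re

# Hypothetical follow pattern of the kind entered in Portia's Crawling section;
# only URLs matching it should be followed by the deployed spider.
follow_pattern = re.compile(r'^https?://www\.example\.com/products/')

candidates = [
    'http://www.example.com/products/widget-1',
    'http://www.example.com/about',
]
# Keep only the URLs the pattern matches.
followed = [url for url in candidates if follow_pattern.match(url)]
```

If deployed links ignore the pattern, comparing the pattern against a sample URL this way is a quick sanity check that the regex itself is correct.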
-
For my use case I need to store the title of a parent with each child document. The references and titles of multiple items are in one XML file, so it needs to be split.
```
Test
http://www.test.com/tes…
```
-
I get this error when trying to compile the samples.
You probably forgot to check something into GitHub.
It happens with both Maven and Gradle.
[ERROR] /home/tsweets/projects/spring-restdocs/rest-notes-sprin…
-
How do I get around the site until it has links?
Can you show a simple example?
-
I have not investigated this yet:
```
(diffeo)stav@platu:~/Workspace/sh/Diffeo/diffeo-netsec$ scrapy crawl blackhat
/home/stav/.virtualenvs/diffeo/src/scrapy/scrapy/contrib/linkextractors/sgml.py:107…
```
-
I'm going to try to figure this out and maybe fork and merge, but what do you think about having an argument to use a forge server? Or should that be a totally different image?
-
It seems like this is a feature: give it -u http://a.example.com, and if there is a link to http://b.example.com then xsscrapy follows and tests it. But IMO that is a big mistake (as a default setting)…
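A safer default would restrict testing to the seed's host. A minimal sketch of such a same-host check (illustrative only, not xsscrapy's actual code):

```python
from urllib.parse import urlparse

def same_host(seed_url, candidate_url):
    # Naive offsite filter: follow a link only when its host
    # matches the host of the seed URL passed via -u.
    return urlparse(seed_url).hostname == urlparse(candidate_url).hostname

# With -u http://a.example.com, a link to b.example.com would be skipped.
allowed = same_host('http://a.example.com', 'http://a.example.com/login')
blocked = same_host('http://a.example.com', 'http://b.example.com/')
```

A real filter would also need to handle subdomains and redirects, but even this naive check would keep the scanner from probing unrelated third-party sites by default.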
-
I have a suggestion to improve Scrapy's Selector. I've seen this construction in many projects:
```
result = sel.xpath('//div/text()').extract()[0]
```
And what about `if result:` and `else:`, or…
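The problem is that `.extract()` returns a list of strings, so indexing with `[0]` raises IndexError whenever the XPath matched nothing. A defensive helper along these lines avoids the crash (the name `extract_first` is illustrative here, sketched over a plain list rather than a live Selector):

```python
def extract_first(results, default=None):
    # .extract() yields a list of strings; an empty match yields [],
    # so fall back to a default instead of raising IndexError.
    return results[0] if results else default

first = extract_first(['first div text'])   # normal case: one or more matches
missing = extract_first([])                 # empty match: default instead of a crash
empty_str = extract_first([], default='')   # caller-chosen fallback
```

Baking something like this into Selector would replace the repetitive `if result: ... else: ...` boilerplate the snippet above forces on every caller.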
-
There is either a bug or some stray code in https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/linkextractors/lxmlhtml.py#L37: the local variable `tag = _nons(el.tag)` is not used, and so `_nons`…
— kmike, updated 10 years ago
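For context, `_nons` follows the lxml.html convention of stripping the XHTML namespace prefix from element tags. A minimal sketch of that behavior (not the exact Scrapy code):

```python
XHTML_NAMESPACE = 'http://www.w3.org/1999/xhtml'

def _nons(tag):
    # Strip the XHTML namespace that lxml attaches to tags parsed from
    # XHTML documents, so '{http://www.w3.org/1999/xhtml}a' is treated as 'a'.
    if isinstance(tag, str) and tag.startswith('{%s}' % XHTML_NAMESPACE):
        return tag.split('}', 1)[-1]
    return tag

stripped = _nons('{http://www.w3.org/1999/xhtml}a')
plain = _nons('a')
```

If the result of `_nons` is computed but discarded, namespaced tags would presumably never match the extractor's tag filter, which is what makes the dead assignment look like a bug rather than leftover code.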
-
We want to add pluggable link extractor backends, maybe via a `LINKEXTRACTOR_CLASS` setting.
Some backends that come to mind: pure-regex, scrapely, libxml2, lxml, sgml
The sgml backend is not wor…
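A `LINKEXTRACTOR_CLASS` setting would presumably hold a dotted path that gets resolved to a backend class at startup. Scrapy already has a helper for this (`scrapy.utils.misc.load_object`); a self-contained sketch of the same resolution using only the stdlib:

```python
from importlib import import_module

def load_object(path):
    """Resolve a dotted path such as 'package.module.ClassName' to the named object."""
    module_path, _, name = path.rpartition('.')
    return getattr(import_module(module_path), name)

# Hypothetical setting value; any importable dotted path resolves the same way,
# so a stdlib class stands in here for a real link extractor backend.
LINKEXTRACTOR_CLASS = 'collections.OrderedDict'
extractor_cls = load_object(LINKEXTRACTOR_CLASS)
```

With this in place, swapping between regex, lxml, or sgml backends would be a one-line settings change rather than a code change.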