AgenteFarron
/
crawler-commons
Automatically exported from code.google.com/p/crawler-commons
issues
Do case-sensitive matching of paths
#68
GoogleCodeExporter
closed
8 years ago
1
AbstractSiteMaps lastmod date not taking time
#67
GoogleCodeExporter
closed
8 years ago
4
Use Forbidden API and explicit Locale and Charset
#66
GoogleCodeExporter
opened
8 years ago
4
[Sitemaps] Make SiteMapTool simpler by removing the Recursive flag
#65
GoogleCodeExporter
closed
8 years ago
7
Upgrade to Tika 1.7
#64
GoogleCodeExporter
closed
8 years ago
2
[Sitemaps] Parser assumes file encoding is the same as the one used by default on JVM
#63
GoogleCodeExporter
opened
8 years ago
8
[Sitemaps] Add new parseSiteMap method
#62
GoogleCodeExporter
closed
8 years ago
4
[Sitemaps] Sitemap Parser changes the processed flag unnecessarily
#61
GoogleCodeExporter
closed
8 years ago
6
[Sitemaps] Upgrade Valid / Legal / Strict SitemapUrls
#60
GoogleCodeExporter
opened
8 years ago
2
Let SimpleRobotRules and its members implement the Serializable interface
#59
GoogleCodeExporter
closed
8 years ago
5
Let SimpleRobotRules implement the Serializable interface
#58
GoogleCodeExporter
closed
8 years ago
1
[Sitemaps] SiteMap should contain a list of SitemapUrls instead of a table of them
#57
GoogleCodeExporter
closed
8 years ago
6
[Sitemaps] SiteMap.setBaseUrl(...) causes the domain name to be lowercased, which shouldn't happen
#56
GoogleCodeExporter
closed
8 years ago
4
[Sitemaps] SitemapUrl "setPriority(String str)" should check for proper value
#55
GoogleCodeExporter
closed
8 years ago
3
Older version of SLF4J being used.
#54
GoogleCodeExporter
closed
8 years ago
2
Spaces in a comma separated list of names in a User-agent: line cause rules to be applicable to all agents
#53
GoogleCodeExporter
closed
8 years ago
5
[url] EffectiveTldFinder - Add a constructor which will auto download the latest TLD list
#52
GoogleCodeExporter
opened
8 years ago
0
Upgrade httpclient to the latest version
#51
GoogleCodeExporter
closed
8 years ago
3
Add Fetch Report to FetchedResult
#50
GoogleCodeExporter
closed
8 years ago
8
Proper code styling template and mechanism to enforce it
#49
GoogleCodeExporter
opened
8 years ago
0
Upgrade the Tika-core to v1.6
#48
GoogleCodeExporter
closed
8 years ago
2
[Sitemaps] SiteMapParser Tika detection doesn't work well in some cases
#47
GoogleCodeExporter
closed
8 years ago
11
[Sitemaps] SiteMapParser Tika detection doesn't work well in all cases
#46
GoogleCodeExporter
closed
8 years ago
4
[Sitemaps] Upgrade code after release of Tika v1.6
#45
GoogleCodeExporter
closed
8 years ago
5
Generate SitemapTool.jar from the SitemapTester
#44
GoogleCodeExporter
opened
8 years ago
7
[Sitemaps] Fix the Tester Util's Logic
#43
GoogleCodeExporter
opened
8 years ago
11
[Sitemaps] Add more JUnit tests
#42
GoogleCodeExporter
closed
8 years ago
8
[Sitemaps] Upgrade to JUnit v4 conventions
#41
GoogleCodeExporter
closed
8 years ago
3
[Sitemaps] Add Tika MediaType Support
#40
GoogleCodeExporter
closed
8 years ago
27
[Sitemaps] Add a convenience method to the Parser with only a URL argument
#39
GoogleCodeExporter
closed
8 years ago
6
[Sitemaps] Add Tika Support
#38
GoogleCodeExporter
closed
8 years ago
1
Upgrade the Slf4j logging Library to v1.7.7
#37
GoogleCodeExporter
closed
8 years ago
2
Support Image Sitemaps
#36
GoogleCodeExporter
opened
8 years ago
0
Support Video Sitemaps
#35
GoogleCodeExporter
opened
8 years ago
3
Upgrade the Slf4j logging in SiteMaps
#34
GoogleCodeExporter
closed
8 years ago
6
Add tar.gz artifact creation to Maven release profile
#33
GoogleCodeExporter
opened
8 years ago
4
[Robots] Resolve relative URL for sitemaps
#32
GoogleCodeExporter
closed
8 years ago
15
Missing top level domains
#31
GoogleCodeExporter
closed
8 years ago
2
SitemapIndex should allow skipping sitemaps
#30
GoogleCodeExporter
closed
8 years ago
2
More robust parsing of sitemap index files
#29
GoogleCodeExporter
closed
8 years ago
2
Sitemap URLs in robots.txt are unnecessarily lowercased
#28
GoogleCodeExporter
closed
8 years ago
1
[SiteMap] Unnecessary String concatenations when logging + in SiteMapURL.toString()
#27
GoogleCodeExporter
closed
8 years ago
1
Set correct default priority for URL in a sitemap file
#26
GoogleCodeExporter
closed
8 years ago
1
Robots.txt parser should not lowercase sitemap URLs
#25
GoogleCodeExporter
closed
8 years ago
3
Sitemap Parser to normalize entries
#24
GoogleCodeExporter
opened
8 years ago
2
Trivial improvements to UserAgent
#23
GoogleCodeExporter
closed
8 years ago
2
Use longest-match-wins approach to matching URLs in robots.txt
#22
GoogleCodeExporter
closed
8 years ago
2
Follow Google example of giving Allow directives higher match weight than Disallow directives
#21
GoogleCodeExporter
closed
8 years ago
6
Catch & report invalid robots.txt rules that include domain name in the URL path
#20
GoogleCodeExporter
opened
8 years ago
0
Support Google's "noindex" extension to robots.txt
#19
GoogleCodeExporter
opened
8 years ago
0