diegov/searchbox
Personal crawling and indexing
GNU General Public License v3.0
2 stars, 0 forks
issues
Replace parsel with BeautifulSoup
#42
diegov
opened
10 months ago
0
Fix errors when body content is missing or in an invalid format
#41
diegov
closed
1 year ago
0
Improve tokenizing of URL
#40
diegov
opened
1 year ago
0
Added indexing of starred GitHub gists
#39
diegov
closed
1 year ago
0
Bump requests from 2.28.1 to 2.31.0
#38
dependabot[bot]
opened
1 year ago
0
Url spider routing
#37
diegov
closed
1 year ago
0
UberSpider and decoupling spiders from Scrapy
#36
diegov
opened
1 year ago
1
Fixed bug which caused existing attributes to be overwritten by empty…
#35
diegov
closed
1 year ago
0
Bump cryptography from 39.0.0 to 39.0.1
#34
dependabot[bot]
closed
1 year ago
1
Updated dependencies
#33
diegov
closed
1 year ago
0
Bump scrapy from 2.6.1 to 2.7.1
#32
dependabot[bot]
closed
1 year ago
1
Use dataclass instead of scrapy item
#31
diegov
closed
1 year ago
0
Improve title display for items with short non-descriptive titles
#30
diegov
closed
1 year ago
0
Bump certifi from 2021.10.8 to 2022.12.7
#29
dependabot[bot]
closed
1 year ago
1
Bump twisted from 22.2.0 to 22.10.0
#28
dependabot[bot]
closed
1 year ago
1
Bump twisted from 22.2.0 to 22.4.0
#27
dependabot[bot]
closed
1 year ago
1
Bump scrapy from 2.6.1 to 2.6.2
#26
dependabot[bot]
closed
1 year ago
1
Bump lxml from 4.8.0 to 4.9.1
#25
dependabot[bot]
closed
1 year ago
1
Add option to process cached results.
#24
diegov
closed
2 years ago
0
Updated dependencies
#23
diegov
closed
2 years ago
0
Add original item timestamp as fallback for recursively crawled items
#22
diegov
opened
2 years ago
0
Bump scrapy from 2.5.1 to 2.6.0
#21
dependabot[bot]
closed
2 years ago
1
Replace single body param in Elasticsearch query with individual params
#20
diegov
closed
2 years ago
0
Don't try to calculate mean when there are no results
#19
diegov
closed
2 years ago
0
Add weight to terms that are part of a hyperlink
#18
diegov
opened
2 years ago
0
Bump lxml from 4.6.4 to 4.6.5
#17
dependabot[bot]
closed
2 years ago
1
Haystack integration
#16
diegov
opened
2 years ago
0
Wayback machine fallback when a request fails
#15
diegov
opened
2 years ago
0
Fix stdev when fewer than 2 results
#14
diegov
closed
2 years ago
0
Infer publication date from `time` HTML elements when nothing else is available, and possibly validate against URL parts
#13
diegov
opened
2 years ago
0
Updated scrapy version, with related fixes
#12
diegov
closed
2 years ago
0
lxml error when parsing documents that contain encoding declaration
#11
diegov
opened
2 years ago
0
Bump scrapy from 2.5.0 to 2.5.1
#10
dependabot[bot]
closed
2 years ago
1
Bump urllib3 from 1.26.4 to 1.26.5
#9
dependabot[bot]
closed
2 years ago
0
Added GitLab stars crawling plus some refactorings
#8
diegov
closed
3 years ago
0
Bump lxml from 4.6.2 to 4.6.3
#7
dependabot[bot]
closed
3 years ago
0
Bump urllib3 from 1.26.2 to 1.26.3
#6
dependabot[bot]
closed
3 years ago
1
Bump cryptography from 3.3.1 to 3.3.2
#5
dependabot[bot]
closed
3 years ago
0
extruct
#4
diegov
closed
3 years ago
0
Extract tags from embedded metadata using extruct
#3
diegov
closed
3 years ago
0
Bump lxml from 4.6.1 to 4.6.2
#2
dependabot[bot]
closed
3 years ago
0
Bump cryptography from 3.1 to 3.2
#1
dependabot[bot]
closed
3 years ago
2