crawlserv/crawlservpp
crawlserv++: Application for crawling and analyzing textual content of websites.
5 stars · 0 forks
Issues
#166 · Use string_view where possible · crawlserv · opened 1 year ago · 0 comments (see the C++ sketch after this list)
#165 · Make source of public IP configurable via config file · crawlserv · opened 1 year ago · 0 comments
#164 · Server crashes on XPath query returning numerical result · crawlserv · closed 1 year ago · 2 comments
#163 · crawler: recover URL when quitting after restart · crawlserv · closed 2 years ago · 0 comments
#162 · restart crawler does not skip URLs in between custom URLs and URL processed last · crawlserv · closed 2 years ago · 1 comment
#160 · frontend > Analyzers: add tooltips to Category, Algorithm, and Name · crawlserv · opened 3 years ago · 0 comments
#158 · frontend: keep order of algorithm tabs · crawlserv · opened 3 years ago · 1 comment
#157 · check worker threads for website/URL list usage · crawlserv · opened 3 years ago · 0 comments
#156 · export: column-selection, CSV, extracted data · crawlserv · opened 3 years ago · 0 comments
#155 · implement zip, spreadsheet import · crawlserv · opened 3 years ago · 0 comments
#154 · extractor: vector out of range [rare] · crawlserv · opened 3 years ago · 2 comments
#153 · sentiment analyzer: minimum sentence length · crawlserv · opened 3 years ago · 0 comments
#152 · sentiment analyzer: make threshold optional · crawlserv · closed 3 years ago · 1 comment
#151 · cannot move queries to global · crawlserv · closed 3 years ago · 0 comments
#150 · sentiment analysis · crawlserv · closed 3 years ago · 1 comment
#149 · use constexpr unordered_set and unordered_map for Data::Sentiment · crawlserv · opened 3 years ago · 0 comments
#148 · import/export/merge dictionaries · crawlserv · opened 3 years ago · 0 comments
#147 · (optionally) delete previous corpus, if underlying data has changed and corpus is re-constructed? · crawlserv · opened 3 years ago · 0 comments
#146 · add dictionary replacer as word processor · crawlserv · opened 3 years ago · 0 comments
#145 · save number of URLs processed by thread to database · crawlserv · closed 3 years ago · 0 comments
#144 · parser: locale::facet::_S_create_c_locale name not valid. · crawlserv · opened 3 years ago · 0 comments
#143 · analyzer: findSavePoint can be substituted by simple MySQL condition · crawlserv · opened 4 years ago · 0 comments
#142 · analyzer: re-creating corpus even when savepoint is available · crawlserv · closed 3 years ago · 0 comments
#141 · analyzer: corrupted map(s) after filter by date · crawlserv · closed 4 years ago · 0 comments
#140 · crawler: warping on connection error does not work · crawlserv · opened 4 years ago · 0 comments
#139 · optimize padding of config structures for crawler & extractor · crawlserv · opened 4 years ago · 0 comments
#138 · frontend: show tokenized corpus savepoints · crawlserv · opened 4 years ago · 0 comments
#137 · frontend: Changing a query adds a newline · crawlserv · closed 4 years ago · 0 comments
#136 · extractor: download linked data as JSON · crawlserv · closed 4 years ago · 1 comment
#135 · extractor: linked data not implemented yet · crawlserv · closed 4 years ago · 0 comments
#134 · extractor: variables.datetime.format & variables.datetime.locale not fully implemented · crawlserv · closed 4 years ago · 0 comments
#133 · server: exception handling · crawlserv · opened 4 years ago · 1 comment
#132 · frontend: do not show slash in front of cross-domain URLs in content tab · crawlserv · closed 4 years ago · 0 comments
#131 · use pugixml as submodule · crawlserv · opened 4 years ago · 0 comments
#130 · clang-tidy refactoring · crawlserv · closed 4 years ago · 0 comments
#129 · Main::Database: delete only 100 URLs at once · crawlserv · closed 4 years ago · 0 comments
#127 · Use Helper::FileSystem::getFreeSpace() to monitor available disk space · crawlserv · opened 4 years ago · 0 comments
#125 · server: separate thread(s) for URL deletion by query · crawlserv · closed 4 years ago · 0 comments
#124 · frontend: download corpus (selection) as txt or json · crawlserv · opened 4 years ago · 0 comments
#122 · analyzer: order articles before corpus creation (first by datetime, second by article id) · crawlserv · closed 4 years ago · 1 comment
#121 · frontend: test query immediately again · crawlserv · opened 4 years ago · 0 comments
#120 · frontend: save website when switching between configs · crawlserv · closed 4 years ago · 0 comments
#119 · analyzer: request corpus with callback function for setting the progress · crawlserv · closed 4 years ago · 0 comments
#118 · networking: sleep with custom callback function · crawlserv · closed 4 years ago · 0 comments
#117 · frontend: use prepared statements · crawlserv · opened 4 years ago · 0 comments
#116 · move #define constants either to config file or to configuration JSON · crawlserv · closed 4 years ago · 1 comment
#115 · extractor: bug in Query::Container · crawlserv · closed 4 years ago · 1 comment
#114 · analyzer: make corpus consistency check optional · crawlserv · closed 4 years ago · 0 comments
#113 · extractor: allow to parse user data (with unique user ids) into separate table · crawlserv · closed 4 years ago · 1 comment
#112 · extractor: option to get multiple pages from each page · crawlserv · closed 4 years ago · 1 comment
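Issue #166 above suggests using std::string_view where possible. As a rough illustration of that kind of change (the function and variable names below are hypothetical and not taken from the crawlserv++ sources), a function that only inspects its string argument can accept std::string_view instead of const std::string&, so string literals and slices of existing buffers can be passed without constructing a temporary std::string:

    #include <iostream>
    #include <string>
    #include <string_view>

    // Hypothetical helper (not from crawlserv++): taking std::string_view instead of
    // const std::string& avoids a temporary std::string when the caller passes a
    // string literal or a slice of an existing buffer.
    bool hasHttpScheme(std::string_view url) {
        return url.substr(0, 7) == "http://" || url.substr(0, 8) == "https://";
    }

    int main() {
        std::string stored{"https://example.com/page"};
        std::cout << hasHttpScheme(stored) << '\n';            // accepts std::string
        std::cout << hasHttpScheme("ftp://host/file") << '\n'; // accepts a literal, no copy
        return 0;
    }

The trade-off is that std::string_view does not own its data, so a view should not be stored beyond the lifetime of the buffer it refers to.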