crawlserv/crawlservpp
crawlserv++: Application for crawling and analyzing textual content of websites.
5 stars · 0 forks
Issues
#166 · Use string_view where possible · crawlserv · opened 1 year ago · 0 comments (see the C++ sketch after this list)
#165 · Make source of public IP configurable via config file · crawlserv · opened 1 year ago · 0 comments
#164 · Server crashes on XPath query returning numerical result · crawlserv · closed 1 year ago · 2 comments
#163 · crawler: recover URL when quitting after restart · crawlserv · closed 2 years ago · 0 comments
#162 · restart crawler does not skip URLs in between custom URLs and URL processed last · crawlserv · closed 2 years ago · 1 comment
#160 · frontend > Analyzers: add tooltips to Category, Algorithm, and Name · crawlserv · opened 3 years ago · 0 comments
#158 · frontend: keep order of algorithm tabs · crawlserv · opened 3 years ago · 1 comment
#157 · check worker threads for website/URL list usage · crawlserv · opened 3 years ago · 0 comments
#156 · export: column-selection, CSV, extracted data · crawlserv · opened 3 years ago · 0 comments
#155 · implement zip, spreadsheet import · crawlserv · opened 3 years ago · 0 comments
#154 · extractor: vector out of range [rare] · crawlserv · opened 3 years ago · 2 comments
#153 · sentiment analyzer: minimum sentence length · crawlserv · opened 3 years ago · 0 comments
#152 · sentiment analyzer: make threshold optional · crawlserv · closed 3 years ago · 1 comment
#151 · cannot move queries to global · crawlserv · closed 3 years ago · 0 comments
#150 · sentiment analysis · crawlserv · closed 3 years ago · 1 comment
#149 · use constexpr unordered_set and unordered_map for Data::Sentiment · crawlserv · opened 3 years ago · 0 comments
#148 · import/export/merge dictionaries · crawlserv · opened 3 years ago · 0 comments
#147 · (optionally) delete previous corpus, if underlying data has changed and corpus is re-constructed? · crawlserv · opened 3 years ago · 0 comments
#146 · add dictionary replacer as word processor · crawlserv · opened 3 years ago · 0 comments
#145 · save number of URLs processed by thread to database · crawlserv · closed 3 years ago · 0 comments
#144 · parser: locale::facet::_S_create_c_locale name not valid. · crawlserv · opened 3 years ago · 0 comments
#143 · analyzer: findSavePoint can be substituted by simple MySQL condition · crawlserv · opened 4 years ago · 0 comments
#142 · analyzer: re-creating corpus even when savepoint is available · crawlserv · closed 3 years ago · 0 comments
#141 · analyzer: corrupted map(s) after filter by date · crawlserv · closed 4 years ago · 0 comments
#140 · crawler: warping on connection error does not work · crawlserv · opened 4 years ago · 0 comments
#139 · optimize padding of config structures for crawler & extractor · crawlserv · opened 4 years ago · 0 comments
#138 · frontend: show tokenized corpus savepoints · crawlserv · opened 4 years ago · 0 comments
#137 · frontend: Changing a query adds a newline · crawlserv · closed 4 years ago · 0 comments
#136 · extractor: download linked data as JSON · crawlserv · closed 4 years ago · 1 comment
#135 · extractor: linked data not implemented yet · crawlserv · closed 4 years ago · 0 comments
#134 · extractor: variables.datetime.format & variables.datetime.locale not fully implemented · crawlserv · closed 4 years ago · 0 comments
#133 · server: exception handling · crawlserv · opened 4 years ago · 1 comment
#132 · frontend: do not show slash in front of cross-domain URLs in content tab · crawlserv · closed 4 years ago · 0 comments
#131 · use pugixml as submodule · crawlserv · opened 4 years ago · 0 comments
#130 · clang-tidy refactoring · crawlserv · closed 4 years ago · 0 comments
#129 · Main::Database: delete only 100 URLs at once · crawlserv · closed 4 years ago · 0 comments
#127 · Use Helper::FileSystem::getFreeSpace() to monitor available disk space · crawlserv · opened 4 years ago · 0 comments
#125 · server: separate thread(s) for URL deletion by query · crawlserv · closed 4 years ago · 0 comments
#124 · frontend: download corpus (selection) as txt or json · crawlserv · opened 4 years ago · 0 comments
#122 · analyzer: order articles before corpus creation (first by datetime, second by article id) · crawlserv · closed 4 years ago · 1 comment
#121 · frontend: test query immediately again · crawlserv · opened 4 years ago · 0 comments
#120 · frontend: save website when switching between configs · crawlserv · closed 4 years ago · 0 comments
#119 · analyzer: request corpus with callback function for setting the progress · crawlserv · closed 4 years ago · 0 comments
#118 · networking: sleep with custom callback function · crawlserv · closed 4 years ago · 0 comments
#117 · frontend: use prepared statements · crawlserv · opened 4 years ago · 0 comments
#116 · move #define constants either to config file or to configuration JSON · crawlserv · closed 4 years ago · 1 comment
#115 · extractor: bug in Query::Container · crawlserv · closed 4 years ago · 1 comment
#114 · analyzer: make corpus consistency check optional · crawlserv · closed 4 years ago · 0 comments
#113 · extractor: allow to parse user data (with unique user ids) into separate table · crawlserv · closed 4 years ago · 1 comment
#112 · extractor: option to get multiple pages from each page · crawlserv · closed 4 years ago · 1 comment
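Issue #166 above suggests using std::string_view where possible. As a rough illustration of that kind of change (the function and variable names below are hypothetical and not taken from the crawlserv++ sources), a function that only inspects its string argument can accept std::string_view instead of const std::string&, so string literals and slices of existing buffers can be passed without constructing a temporary std::string:

    #include <iostream>
    #include <string>
    #include <string_view>

    // Hypothetical helper (not from crawlserv++): taking std::string_view instead of
    // const std::string& avoids a temporary std::string when the caller passes a
    // string literal or a slice of an existing buffer.
    bool hasHttpScheme(std::string_view url) {
        return url.substr(0, 7) == "http://" || url.substr(0, 8) == "https://";
    }

    int main() {
        std::string stored{"https://example.com/page"};
        std::cout << hasHttpScheme(stored) << '\n';            // accepts std::string
        std::cout << hasHttpScheme("ftp://host/file") << '\n'; // accepts a literal, no copy
        return 0;
    }

The trade-off is that std::string_view does not own its data, so a view should not be stored beyond the lifetime of the buffer it refers to.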