MusicConnectionMachine/UnstructuredData
In this project we will be scanning unstructured online resources such as the Common Crawl data set.
GNU General Public License v3.0
3 stars, 1 fork
issues
Timeout during download kills process
#223
felixschorer
opened
7 years ago
0
0.0.1 Release
#222
felixschorer
closed
7 years ago
3
final report PDF
#221
nbasargin
closed
7 years ago
0
Update README.md
#220
felixschorer
closed
7 years ago
0
Importing the data to the database
#219
pfent
closed
7 years ago
3
Fixed bug where website cache didn't get deleted after saving to cloud
#218
felixschorer
closed
7 years ago
1
File will now be considered "finished" if zlib encounters an error
#217
felixschorer
closed
7 years ago
0
Added retries to WetManager
#216
felixschorer
closed
7 years ago
0
Fixed bug where processes tried to delete already deleted queue items
#215
felixschorer
closed
7 years ago
0
Improved performance
#214
felixschorer
closed
7 years ago
0
Added option to store metadata as JSON files
#213
felixschorer
closed
7 years ago
0
Updated scripts
#212
lukasstreit
closed
7 years ago
0
Increased the speed of populating the queue
#211
felixschorer
closed
7 years ago
0
Added correct content length and removed unnecessary block digest
#210
felixschorer
closed
7 years ago
0
Fixed unintended retry loop
#209
felixschorer
closed
7 years ago
0
Hotfix for Storer
#208
felixschorer
closed
7 years ago
0
Worker objects no longer access global parameters
#207
felixschorer
closed
7 years ago
4
Added term loader retry
#206
lukasstreit
closed
7 years ago
0
Fixed various retries
#205
felixschorer
closed
7 years ago
0
Fixed retry for database connection
#204
lukasstreit
closed
7 years ago
0
Deleted configsample.json
#203
felixschorer
closed
7 years ago
0
Added method to shrink down web page content to only relevant bits
#202
felixschorer
closed
7 years ago
1
Cleaned up the project
#201
felixschorer
closed
7 years ago
0
Report
#200
nbasargin
closed
7 years ago
10
Implemented #197
#199
felixschorer
closed
7 years ago
2
add a new column to the csv file
#198
goldbergtatyana
closed
7 years ago
1
Add CLI options for queue manipulation
#197
felixschorer
closed
7 years ago
1
Expand blacklist and gather heuristic statistics
#196
felixschorer
closed
7 years ago
20
Added missing '!'...
#195
felixschorer
closed
7 years ago
1
Make file paths always relative to project root
#194
felixschorer
closed
7 years ago
1
Added way to block terms via term-blacklist.txt
#193
felixschorer
closed
7 years ago
0
Added blacklist with terms that we should not match
#192
lukasstreit
closed
7 years ago
0
Added null check since some terms in the db were null for some reason.
#191
lukasstreit
closed
7 years ago
0
Fixed tsconfig to exclude scripts folder
#190
lukasstreit
closed
7 years ago
0
Updated README
#189
felixschorer
closed
7 years ago
0
Added deployment scripts and script for progress checking.
#188
lukasstreit
closed
7 years ago
0
Fixed fatal error which prevented Worker from working correctly
#187
felixschorer
closed
7 years ago
1
Renamed dbDatabase to dbName and removed duplicate code
#186
felixschorer
closed
7 years ago
0
Added dbDatabase parameter to config file
#185
lukasstreit
closed
7 years ago
0
Remove duplicate param init
#184
felixschorer
closed
7 years ago
1
Remove legacy code
#183
felixschorer
closed
7 years ago
2
database website bulk creation
#182
lukasstreit
closed
7 years ago
0
Added support for Azure queues
#181
felixschorer
closed
7 years ago
1
Send kill signal to workers
#180
sacdallago
closed
7 years ago
11
Progress indication
#179
sacdallago
closed
7 years ago
10
Now collecting all entries until flush() is called
#178
felixschorer
closed
7 years ago
0
Now calling callback with error on db connection error.
#177
felixschorer
closed
7 years ago
0
added error check & reconnect after creating connection to db.
#176
lukasstreit
closed
7 years ago
0
Error communication in Worker.ts
#175
felixschorer
closed
7 years ago
0
Updated submodule
#174
lukasstreit
closed
7 years ago
0