medialab/minet
A webmining CLI tool & library for Python.
GNU General Public License v3.0
281 stars
26 forks
issues
#988 | Fix typos again | kianmeng | closed | 1 day ago | 1
#987 | Refactor RequestRetrying to avoid it altogether | Yomguithereal | closed | 2 days ago | 0
#986 | Upgrade ural | Yomguithereal | closed | 3 weeks ago | 0
#985 | Upgrade trafilatura | Yomguithereal | closed | 4 weeks ago | 0
#984 | Retire minet.buzzsumo, minet.crowdtangle | Yomguithereal | closed | 4 weeks ago | 0
#983 | Forward SelectionError to minet.scrape | Yomguithereal | closed | 4 weeks ago | 0
#982 | Command to add jobs to a crawler's queue | Yomguithereal | opened | 2 months ago | 0
#981 | CrawlJob data type should not be wrapped in an optional by default | Yomguithereal | closed | 4 weeks ago | 0
#980 | Adjust twitter scraper retryer and rate limit (again) | Yomguithereal | closed | 4 weeks ago | 0
#979 | Add more automatic context when Spider.process raises | Yomguithereal | closed | 4 weeks ago | 0
#978 | KeyError: 'expanded_url' with minet twitter scrape tweets | lakonis | closed | 3 months ago | 5
#977 | Issues (core dump or cannot unpack non-iterable FocusCrawlInfo object) | TeaS0710 | closed | 4 weeks ago | 1
#976 | potential changes in rate limit of twitter public API | taniki | closed | 4 weeks ago | 3
#975 | When -c is not specified, we should default to test all available browsers instead of only firefox | Yomguithereal | opened | 4 months ago | 1
#974 | -I should default to "downloaded" in scrape and extract | Yomguithereal | closed | 4 weeks ago | 1
#973 | "Invalid Twitter cookie!" error (possibly due to migration from twitter.com to x.com ?) | leomignot | closed | 4 months ago | 3
#972 | tiktok search-videos error | csamson-sf | opened | 4 months ago | 3
#971 | Spider process exceptions should at least be raised with some context around them | Yomguithereal | closed | 4 weeks ago | 0
#970 | Add FORWARD_SPIDER option | Yomguithereal | closed | 4 weeks ago | 0
#969 | Error on wikipedia pageviews | bmaz | opened | 4 months ago | 1
#968 | Scraping 1000's of comments on Instagram | Geminy3 | opened | 4 months ago | 3
#967 | Retrieve videos from instagram hashtag function | Tyrannas | opened | 5 months ago | 9
#965 | Improve ThreadsafeBrowser.request stability by retrying content acquisition if needed | Yomguithereal | closed | 5 months ago | 0
#964 | Add LoadingBar.track | Yomguithereal | closed | 4 weeks ago | 0
#963 | ThreadsafeBrowser enhancements | Yomguithereal | opened | 5 months ago | 0
#962 | instagram post-infos should have line parity in the output and increase a stat rather than log | Yomguithereal | closed | 5 months ago | 0
#961 | Refactor Crawler request_args as inheritance | Yomguithereal | opened | 5 months ago | 0
#960 | Upgrade rich and other deps | Yomguithereal | closed | 5 months ago | 0
#959 | Upgrade trafilatura and deal with lxml_html_clean | Yomguithereal | closed | 5 months ago | 0
#958 | Spider process error should lead to errored crawl? | Yomguithereal | closed | 5 months ago | 1
#957 | Add a playwright version of the crawler | Yomguithereal | closed | 5 months ago | 0
#956 | Upgrade to min version py3.8 | Yomguithereal | closed | 5 months ago | 0
#955 | Method of CrawlJob to get an identical CrawlTarget to retry | Yomguithereal | closed | 5 months ago | 0
#954 | There should be a Crawler side global callback for each job | Yomguithereal | opened | 5 months ago | 0
#953 | os.makedirs is already threadsafe | Yomguithereal | closed | 5 months ago | 0
#952 | Move away from lxml as default soup engine? | Yomguithereal | closed | 5 months ago | 0
#951 | Crawler should lazily open data files to be written until there is actually something to write | Yomguithereal | closed | 4 weeks ago | 0
#950 | Add some crawler level job filter | Yomguithereal | opened | 6 months ago | 0
#949 | Make minet installable without playwright | Yomguithereal | opened | 6 months ago | 0
#948 | Youtube improvements | Yomguithereal | closed | 6 months ago | 0
#947 | A challenges challenge | Yomguithereal | opened | 6 months ago | 0
#946 | Multithreaded API clients connection pools should allow more connections | Yomguithereal | closed | 5 months ago | 0
#945 | Add a "minet hal" module? | boogheta | opened | 6 months ago | 0
#944 | x.com urls are not usually recognized | Yomguithereal | closed | 6 months ago | 0
#943 | Command tw tweets should not require API key | Yomguithereal | closed | 6 months ago | 0
#942 | ErroredCrawlReponse should de facto None response attributes | Yomguithereal | closed | 6 months ago | 0
#941 | User friendly error message when spider returns a non-2-tuple | Yomguithereal | closed | 5 months ago | 0
#940 | Adding a scraper for Facebook users' hometown and current city information | camillechanial | closed | 7 months ago | 0
#939 | path column is not very useful when using --glob file on scrape/extract | Yomguithereal | opened | 7 months ago | 0
#938 | Adding builtin scraper for Europresse | bmaz | closed | 7 months ago | 0