medialab/minet
A webmining CLI tool & library for python.
GNU General Public License v3.0
286 stars, 26 forks
issues
Twitter search query cannot use double quotes
#990
Tyrannas
closed
1 week ago
3
When I try to extract comments from an Instagram post, I get a runtime error
#989
wacns
closed
1 month ago
7
Fix typos again
#988
kianmeng
closed
1 month ago
1
Refactor RequestRetrying to avoid it altogether
#987
Yomguithereal
closed
1 month ago
0
Upgrade ural
#986
Yomguithereal
closed
2 months ago
0
Upgrade trafilatura
#985
Yomguithereal
closed
2 months ago
0
Retire minet.buzzsumo, minet.crowdtangle
#984
Yomguithereal
closed
2 months ago
0
Forward SelectionError to minet.scrape
#983
Yomguithereal
closed
2 months ago
0
Command to add jobs to a crawler's queue
#982
Yomguithereal
opened
3 months ago
0
CrawlJob data type should not be wrapped in an optional by default
#981
Yomguithereal
closed
2 months ago
0
Adjust twitter scraper retryer and rate limit (again)
#980
Yomguithereal
closed
2 months ago
0
Add more automatic context when Spider.process raises
#979
Yomguithereal
closed
2 months ago
0
KeyError: 'expanded_url' with minet twitter scrape tweets
#978
lakonis
closed
4 months ago
5
Issues (core dump or cannot unpack non-iterable FocusCrawlInfo object)
#977
TeaS0710
closed
2 months ago
1
potential changes in rate limit of twitter public API
#976
taniki
closed
2 months ago
3
When -c is not specified, we should default to test all available browsers instead of only firefox
#975
Yomguithereal
opened
5 months ago
1
-I should default to "downloaded" in scrape and extract
#974
Yomguithereal
closed
2 months ago
1
"Invalid Twitter cookie!" error (possibly due to migration from twitter.com to x.com ?)
#973
leomignot
closed
5 months ago
3
tiktok search-videos error
#972
csamson-sf
opened
6 months ago
3
Spider process exceptions should at least be raised with some context around them
#971
Yomguithereal
closed
2 months ago
0
Add FORWARD_SPIDER option
#970
Yomguithereal
closed
2 months ago
0
Error on wikipedia pageviews
#969
bmaz
opened
6 months ago
1
Scraping thousands of comments on Instagram
#968
Geminy3
opened
6 months ago
3
Retrieve videos from instagram hashtag function
#967
Tyrannas
opened
6 months ago
9
Improve ThreadsafeBrowser.request stability by retrying content acquisition if needed
#965
Yomguithereal
closed
7 months ago
0
Add LoadingBar.track
#964
Yomguithereal
closed
2 months ago
0
ThreadsafeBrowser enhancements
#963
Yomguithereal
opened
7 months ago
0
instagram post-infos should have line parity in the output and increase a stat rather than log
#962
Yomguithereal
closed
7 months ago
0
Refactor Crawler request_args as inheritance
#961
Yomguithereal
opened
7 months ago
0
Upgrade rich and other deps
#960
Yomguithereal
closed
7 months ago
0
Upgrade trafilatura and deal with lxml_html_clean
#959
Yomguithereal
closed
7 months ago
0
Spider process error should lead to errored crawl?
#958
Yomguithereal
closed
7 months ago
1
Add a playwright version of the crawler
#957
Yomguithereal
closed
7 months ago
0
Upgrade to min version py3.8
#956
Yomguithereal
closed
7 months ago
0
Method of CrawlJob to get an identical CrawlTarget to retry
#955
Yomguithereal
closed
7 months ago
0
There should be a Crawler side global callback for each job
#954
Yomguithereal
opened
7 months ago
0
os.makedirs is already threadsafe
#953
Yomguithereal
closed
7 months ago
0
Move away from lxml as default soup engine?
#952
Yomguithereal
closed
7 months ago
0
Crawler should lazily open data files to be written until there is actually something to write
#951
Yomguithereal
closed
2 months ago
0
Add some crawler level job filter
#950
Yomguithereal
opened
7 months ago
0
Make minet installable without playwright
#949
Yomguithereal
opened
7 months ago
0
Youtube improvements
#948
Yomguithereal
closed
7 months ago
0
A challenges challenge
#947
Yomguithereal
opened
7 months ago
0
Multithreaded API clients connection pools should allow more connections
#946
Yomguithereal
closed
7 months ago
0
Add a "minet hal" module?
#945
boogheta
opened
8 months ago
0
x.com urls are not usually recognized
#944
Yomguithereal
closed
8 months ago
0
Command tw tweets should not require API key
#943
Yomguithereal
closed
8 months ago
0
ErroredCrawlReponse should de facto None response attributes
#942
Yomguithereal
closed
8 months ago
0
User friendly error message when spider returns a non-2-tuple
#941
Yomguithereal
closed
7 months ago
0
Adding a scraper for Facebook users' hometown and current city information
#940
camillechanial
closed
9 months ago
0