Closed awallemo closed 3 years ago
Hi @awallemo, can you please share your sites.yaml?
Do I just copy the content of the file?
Yes, please. Or you can put it on pastebin or a similar site.
/required, name of site
- name: factcheck.org
/ required, primary domain of factcheck.org
domain: factcheck.org
/ required, type of this site, it is a fact checking site
site_type: fact_checking
/ base URL, default by infer, the home of factcheck.org
base_url: http://www.factcheck.org/
/ site tags, default [], more about this site
site_tags: []
/ alternate domains, default [], secondary domains
/ that redirect to the primary domain
alternate_domains: []
/ is factcheck.org is alive, default true
is_alive: true
/ is factcheck.org is enabled, default true
/ when false, this site will not be tract and lucene search may ignore it
is_enabled: true
/ rules of how to crawl factcheck.org
article_rules:
/ regular expression of url we like to collect
/ right now this field does not used, please ignore
url_regex: ^http://www\.factcheck.org/20[0-2]\d/((0[1-9])|(1[0-2]))/[^/\s]+/?$
/ how to fetch the new articles update from factcheck.org
/ by default, page.spider, which crawls the home page of this site
update:
/ here we use feed.spider as we have RSS feed URL,
/ see hoaxy.crawl.spiders
- spider_name: feed.spider
/ the necessary parameters for building spider instance
spider_kwargs:
/ here, we need the RSS feed URLs
urls:
- http://www.factcheck.org/feed/
/ and also who providse the RSS feed
/ normally the website itself, sometimes a third party, e.g. feedburner
provider: self
/ how to fetch archive of factcheck.org
/ by default, site.page, which crawls the whole site.
archive:
/ here we use page_template.spider, as factcheck.org use a page
/ template to list all of its posted articles
- spider_name: page_template.spider
spider_kwargs:
/ a list of xpaths to extract links (to find @href)
/ by default, a python tuple('/html/body',) is used
/ to fetch all links in this page
/ here, we use specified xpath expression
/ Note: please do not include /a/@href part
href_xpaths:
- //article//header/h2
/ page templates of factcheck.org
/ increasing by page number
page_templates:
- http://www.factcheck.org/page/{p_num}
/ factcheck.org also provides sitemap.xml to help us collect all
/ links in this site
- spider_name: sitemap.spider
spider_kwargs:
/ these URLs could be actual sitemap URL
/ OR they could be the entry of a list of sitemap.xml files
/ the spider will follow all XML links and
/ assuming these XML file are sitemaps and extract non-xml
/ links
urls:
- http://www.factcheck.org/sitemap.xml
/Another site, thedcgazette.com
- name: thedcgazette.com
domain: thedcgazette.com
site_type: claim
base_url: http://thedcgazette.com/
site_tags:
- source: fakenewswatch.com
name: hoax
alternate_domains:
- is_alive: true
name: dcgazette.com
is_alive: true
is_enabled: true
article_rules:
url_regex:
update:
- spider_name: feed.spider
spider_kwargs:
urls:
- http://dcgazette.com/feed/
provider: self
archive:
- spider_name: page_template.spider
spider_kwargs:
href_xpaths:
- //div[@id="main-content"]/article/header/h3
page_templates:
- http://dcgazette.com/page/{p_num}/
The syntax of this file is not correct. In YAML files, comment lines must start with the hash character (#), not with the forward slash character (/). For example:
/required, name of site
- name: factcheck.org
/ required, primary domain of factcheck.org
Should be:
# required, name of site
- name: factcheck.org
# required, primary domain of factcheck.org
You can see an example here: https://github.com/IUNetSci/hoaxy-backend/blob/master/hoaxy/data/samples/sites.sample.yaml
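To illustrate the comment-syntax point above, here is a minimal Python sketch that rewrites lines beginning with a forward slash into `#` comments. The function name `slash_comments_to_hash` is hypothetical and not part of hoaxy, and the sketch assumes that any line whose first non-blank character is `/` was meant as a comment.

```python
def slash_comments_to_hash(text):
    """Rewrite lines that start with '/' (after indentation) as '#' YAML comments.

    Hypothetical helper, not part of hoaxy. Assumes every line whose first
    non-blank character is '/' was intended as a comment.
    """
    fixed = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("/"):
            # Preserve the original indentation, drop the slash(es).
            indent = line[: len(line) - len(stripped)]
            body = stripped.lstrip("/").strip()
            fixed.append(f"{indent}# {body}")
        else:
            fixed.append(line)
    return "\n".join(fixed)


sample = (
    "/required, name of site\n"
    "- name: factcheck.org\n"
    "/ required, primary domain of factcheck.org"
)
print(slash_comments_to_hash(sample))
```

Lines such as `- //article//header/h2` are left untouched because their first non-blank character is `-`, not `/`.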
Oh, yes, I manually changed this here so that it would be formatted in the correct size.
Could you please put it on pastebin.com and share the link here? https://pastebin.com/
You have two -archive blocks, which I don't think is allowed.
Can you please try to run hoaxy init using this file instead? https://github.com/IUNetSci/hoaxy-backend/blob/master/hoaxy/data/samples/sites.sample.yaml
You will need to rename it to sites.yaml.
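The copy-and-rename step can be sketched in Python as follows. This is only an illustration: `install_sites_file` is a hypothetical helper, not a hoaxy command, and the configuration directory is assumed to be the one hoaxy reports in its log output (`/home/ubuntu/.hoaxy` on the AMI).

```python
import shutil
from pathlib import Path


def install_sites_file(sample_path, config_dir):
    """Copy the sample file into config_dir under the name sites.yaml.

    Hypothetical helper, not part of hoaxy. config_dir is assumed to be the
    directory hoaxy reads its configuration from (for example ~/.hoaxy).
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "sites.yaml"
    # Overwrites any existing sites.yaml in config_dir.
    shutil.copyfile(sample_path, target)
    return target
```

For example, `install_sites_file("hoaxy/data/samples/sites.sample.yaml", Path.home() / ".hoaxy")` would place the sample where the log messages say hoaxy looks for it. Back up any existing sites.yaml first, since the copy overwrites it.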
I must have copy pasted wrong from my sites.yaml file. Here is the correct one https://pastebin.com/Y5AQVe9N
My sites.yaml file does not have two archive blocks
Thank you. I don't see anything odd in your file (it seems you are using the same file as the sample one).
How did you get hoaxy? Did you build it yourself or did you use one of the images (Docker or Amazon AMI)?
It's deployed on an Amazon AWS EC2 instance.
Got it. I don't have access to EC2 right now so I cannot check, but perhaps @chathuriw can try to reproduce the problem on the AMI. I've marked the issue as a potential bug for her reference.
Okay, I'll wait for her then. Thank you!
I created a new instance with hoaxy-ami, enabled the conda environment, and ran hoaxy init --ignore-redirected --ignore-inactive. It ran successfully.
(hoaxy) ubuntu@ip-172-31-39-191:~/hoaxy-backend$ hoaxy init --ignore-redirected --ignore-inactive
2021-01-29 19:40:09,323 - hoaxy(init) - INFO: Creating database tables:
2021-01-29 19:40:09,323 - hoaxy(init) - WARNING: Ignore existed tables
2021-01-29 19:40:09,344 - hoaxy(init) - INFO: Inserting platforms if not exist
2021-01-29 19:40:09,368 - hoaxy(init) - INFO: Trying to load site data:
2021-01-29 19:40:09,368 - hoaxy(init) - INFO: Claim domains /home/ubuntu/.hoaxy/domains_claim.txt found
2021-01-29 19:40:09,369 - hoaxy(init) - INFO: Sending HTTP requests to infer base URLs ...
2021-01-29 19:40:09,370 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTP connection (1): infowars.com:80
2021-01-29 19:40:09,404 - hoaxy(init)[urllib3.connectionpool] - DEBUG: http://infowars.com:80 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:09,406 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): infowars.com:443
2021-01-29 19:40:09,482 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://infowars.com:443 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:09,484 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): www.infowars.com:443
2021-01-29 19:40:09,594 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://www.infowars.com:443 "HEAD / HTTP/1.1" 200 0
2021-01-29 19:40:09,598 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTP connection (1): empirenews.net:80
2021-01-29 19:40:09,667 - hoaxy(init)[urllib3.connectionpool] - DEBUG: http://empirenews.net:80 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:09,668 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): empirenews.net:443
2021-01-29 19:40:09,858 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://empirenews.net:443 "HEAD / HTTP/1.1" 200 0
2021-01-29 19:40:09,863 - hoaxy(init) - DEBUG: Insert or update site infowars.com
2021-01-29 19:40:09,864 - hoaxy(init) - DEBUG: Insert or update site empirenews.net
2021-01-29 19:40:09,864 - hoaxy(init) - INFO: Fact checking domains /home/ubuntu/.hoaxy/domains_factchecking.txt found
2021-01-29 19:40:09,864 - hoaxy(init) - INFO: Sending HTTP requests to infer base URLs ...
2021-01-29 19:40:09,865 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTP connection (1): snopes.com:80
2021-01-29 19:40:09,928 - hoaxy(init)[urllib3.connectionpool] - DEBUG: http://snopes.com:80 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:09,929 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): snopes.com:443
2021-01-29 19:40:10,104 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://snopes.com:443 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:10,106 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): www.snopes.com:443
2021-01-29 19:40:10,178 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://www.snopes.com:443 "HEAD / HTTP/1.1" 200 0
2021-01-29 19:40:10,182 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTP connection (1): factcheck.org:80
2021-01-29 19:40:10,225 - hoaxy(init)[urllib3.connectionpool] - DEBUG: http://factcheck.org:80 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:10,226 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): factcheck.org:443
2021-01-29 19:40:10,497 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://factcheck.org:443 "HEAD / HTTP/1.1" 301 0
2021-01-29 19:40:10,499 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): www.factcheck.org:443
2021-01-29 19:40:10,558 - hoaxy(init)[urllib3.connectionpool] - DEBUG: https://www.factcheck.org:443 "HEAD / HTTP/1.1" 200 0
2021-01-29 19:40:10,563 - hoaxy(init) - DEBUG: Insert or update site snopes.com
2021-01-29 19:40:10,566 - hoaxy(init) - DEBUG: Insert or update site factcheck.org
2021-01-29 19:40:10,566 - hoaxy(init) - INFO: Site file /home/ubuntu/.hoaxy/sites.yaml found
2021-01-29 19:40:10,936 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTP connection (1): duffelblog.com:80
2021-01-29 19:40:10,944 - hoaxy(init) - ERROR: HTTPConnectionPool(host='duffelblog.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4bbbde0978>: Failed to establish a new connection: [Errno -5] No address associated with hostname'))
2021-01-29 19:40:10,945 - hoaxy(init)[urllib3.connectionpool] - DEBUG: Starting new HTTPS connection (1): duffelblog.com:443
2021-01-29 19:40:10,945 - hoaxy(init) - ERROR: HTTPSConnectionPool(host='duffelblog.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f4bbbde05f8>: Failed to establish a new connection: [Errno -5] No address associated with hostname'))
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Domain duffelblog.com is inactive!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site bigamericannews.com is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site christwire.org is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site drudgereport.com.co is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site empirenews.com is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site msnbc.co is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site msnbc.website is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site sprotspickle.com is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site duffleblog.com is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site nahadaily.com is inactive now!
2021-01-29 19:40:10,946 - hoaxy(init) - WARNING: Site libertymovementradio.com is inactive now!
2021-01-29 19:40:10,947 - hoaxy(init) - WARNING: Site wakingupwisconsin.com is inactive now!
2021-01-29 19:40:10,958 - hoaxy(init) - DEBUG: Insert or update site factcheck.org
2021-01-29 19:40:10,974 - hoaxy(init) - DEBUG: Insert or update site politifact.com
2021-01-29 19:40:10,979 - hoaxy(init) - DEBUG: Insert or update site opensecrets.org
2021-01-29 19:40:10,984 - hoaxy(init) - DEBUG: Insert or update site snopes.com
2021-01-29 19:40:10,990 - hoaxy(init) - DEBUG: Insert or update site truthorfiction.com
2021-01-29 19:40:10,996 - hoaxy(init) - DEBUG: Insert or update site hoax-slayer.com
2021-01-29 19:40:11,001 - hoaxy(init) - DEBUG: Insert or update site hoax-slayer.net
2021-01-29 19:40:11,006 - hoaxy(init) - DEBUG: Insert or update site badsatiretoday.com
2021-01-29 19:40:11,012 - hoaxy(init) - DEBUG: Insert or update site climatefeedback.org
2021-01-29 19:40:11,126 - hoaxy(init) - DEBUG: Insert or update site americannews.com
2021-01-29 19:40:11,133 - hoaxy(init) - DEBUG: Insert or update site civictribune.com
2021-01-29 19:40:11,142 - hoaxy(init) - DEBUG: Insert or update site clickhole.com
2021-01-29 19:40:11,263 - hoaxy(init) - DEBUG: Insert or update site thedcgazette.com
2021-01-29 19:40:11,271 - hoaxy(init) - DEBUG: Insert or update site dailycurrant.com
2021-01-29 19:40:11,280 - hoaxy(init) - DEBUG: Insert or update site dcclothesline.com
2021-01-29 19:40:11,286 - hoaxy(init) - DEBUG: Insert or update site derfmagazine.com
2021-01-29 19:40:11,299 - hoaxy(init) - DEBUG: Insert or update site duhprogressive.com
2021-01-29 19:40:11,306 - hoaxy(init) - DEBUG: Insert or update site enduringvision.com
2021-01-29 19:40:11,313 - hoaxy(init) - DEBUG: Insert or update site nationalreport.net
2021-01-29 19:40:11,321 - hoaxy(init) - DEBUG: Insert or update site newsbiscuit.com
2021-01-29 19:40:11,329 - hoaxy(init) - DEBUG: Insert or update site newsmutiny.com
2021-01-29 19:40:11,337 - hoaxy(init) - DEBUG: Insert or update site politicalears.com
2021-01-29 19:40:11,344 - hoaxy(init) - DEBUG: Insert or update site private-eye.co.uk
2021-01-29 19:40:11,351 - hoaxy(init) - DEBUG: Insert or update site realnewsrightnow.com
2021-01-29 19:40:11,357 - hoaxy(init) - DEBUG: Insert or update site rilenews.com
2021-01-29 19:40:11,364 - hoaxy(init) - DEBUG: Insert or update site thenewsnerd.com
2021-01-29 19:40:11,371 - hoaxy(init) - DEBUG: Insert or update site theuspatriot.com
2021-01-29 19:40:11,378 - hoaxy(init) - DEBUG: Insert or update site witscience.org
2021-01-29 19:40:11,388 - hoaxy(init) - DEBUG: Insert or update site amplifyingglass.com
2021-01-29 19:40:11,394 - hoaxy(init) - DEBUG: Insert or update site empiresports.co
2021-01-29 19:40:11,401 - hoaxy(init) - DEBUG: Insert or update site gomerblog.com
2021-01-29 19:40:11,410 - hoaxy(init) - DEBUG: Insert or update site huzlers.com
2021-01-29 19:40:11,416 - hoaxy(init) - DEBUG: Insert or update site itaglive.com
2021-01-29 19:40:11,431 - hoaxy(init) - DEBUG: Insert or update site politicops.com
2021-01-29 19:40:11,438 - hoaxy(init) - DEBUG: Insert or update site rockcitytimes.com
2021-01-29 19:40:11,446 - hoaxy(init) - DEBUG: Insert or update site thelapine.ca
2021-01-29 19:40:11,453 - hoaxy(init) - DEBUG: Insert or update site thespoof.com
2021-01-29 19:40:11,460 - hoaxy(init) - DEBUG: Insert or update site weeklyworldnews.com
2021-01-29 19:40:11,467 - hoaxy(init) - DEBUG: Insert or update site worldnewsdailyreport.com
2021-01-29 19:40:11,477 - hoaxy(init) - DEBUG: Insert or update site 21stcenturywire.com
2021-01-29 19:40:11,484 - hoaxy(init) - DEBUG: Insert or update site activistpost.com
2021-01-29 19:40:11,491 - hoaxy(init) - DEBUG: Insert or update site beforeitsnews.com
2021-01-29 19:40:11,498 - hoaxy(init) - DEBUG: Insert or update site chronicle.su
2021-01-29 19:40:11,507 - hoaxy(init) - DEBUG: Insert or update site coasttocoastam.com
2021-01-29 19:40:11,514 - hoaxy(init) - DEBUG: Insert or update site consciouslifenews.com
2021-01-29 19:40:11,521 - hoaxy(init) - DEBUG: Insert or update site countdowntozerotime.com
2021-01-29 19:40:11,527 - hoaxy(init) - DEBUG: Insert or update site counterpsyops.com
2021-01-29 19:40:11,535 - hoaxy(init) - DEBUG: Insert or update site dailybuzzlive.com
2021-01-29 19:40:11,542 - hoaxy(init) - DEBUG: Insert or update site disclose.tv
2021-01-29 19:40:11,548 - hoaxy(init) - DEBUG: Insert or update site fprnradio.com
2021-01-29 19:40:11,557 - hoaxy(init) - DEBUG: Insert or update site geoengineeringwatch.org
2021-01-29 19:40:11,565 - hoaxy(init) - DEBUG: Insert or update site globalresearch.ca
2021-01-29 19:40:11,576 - hoaxy(init) - DEBUG: Insert or update site govtslaves.info
2021-01-29 19:40:11,583 - hoaxy(init) - DEBUG: Insert or update site gulagbound.com
2021-01-29 19:40:11,589 - hoaxy(init) - DEBUG: Insert or update site hangthebankers.com
2021-01-29 19:40:11,596 - hoaxy(init) - DEBUG: Insert or update site humansarefree.com
2021-01-29 19:40:11,605 - hoaxy(init) - DEBUG: Insert or update site infowars.com
2021-01-29 19:40:11,612 - hoaxy(init) - DEBUG: Insert or update site intellihub.com
2021-01-29 19:40:11,622 - hoaxy(init) - DEBUG: Insert or update site lewrockwell.com
2021-01-29 19:40:11,629 - hoaxy(init) - DEBUG: Insert or update site libertytalk.fm
2021-01-29 19:40:11,637 - hoaxy(init) - DEBUG: Insert or update site naturalnews.com
2021-01-29 19:40:11,644 - hoaxy(init) - DEBUG: Insert or update site newswire-24.com
2021-01-29 19:40:11,651 - hoaxy(init) - DEBUG: Insert or update site nodisinfo.com
2021-01-29 19:40:11,658 - hoaxy(init) - DEBUG: Insert or update site nowtheendbegins.com
2021-01-29 19:40:11,665 - hoaxy(init) - DEBUG: Insert or update site pakalertpress.com
2021-01-29 19:40:11,674 - hoaxy(init) - DEBUG: Insert or update site politicalblindspot.com
2021-01-29 19:40:11,683 - hoaxy(init) - DEBUG: Insert or update site prisonplanet.com
2021-01-29 19:40:11,695 - hoaxy(init) - DEBUG: Insert or update site realfarmacy.com
2021-01-29 19:40:11,702 - hoaxy(init) - DEBUG: Insert or update site redflagnews.com
2021-01-29 19:40:11,712 - hoaxy(init) - DEBUG: Insert or update site thedailysheeple.com
2021-01-29 19:40:11,730 - hoaxy(init) - DEBUG: Insert or update site therundownlive.com
2021-01-29 19:40:11,738 - hoaxy(init) - DEBUG: Insert or update site unconfirmedsources.com
2021-01-29 19:40:11,746 - hoaxy(init) - DEBUG: Insert or update site veteranstoday.com
2021-01-29 19:40:11,756 - hoaxy(init) - DEBUG: Insert or update site worldtruth.tv
2021-01-29 19:40:11,760 - hoaxy(init) - DEBUG: Insert or update site thevalleyreport.com
2021-01-29 19:40:11,766 - hoaxy(init) - DEBUG: Insert or update site departed.co
2021-01-29 19:40:11,771 - hoaxy(init) - DEBUG: Insert or update site myfreshnews.com
2021-01-29 19:40:11,775 - hoaxy(init) - INFO: Added or updated sites are: [('infowars.com', 'claim', 'http://www.infowars.com/'), ('snopes.com', 'fact_checking', 'http://www.snopes.com/'), ('factcheck.org', 'fact_checking', 'http://www.factcheck.org/'), ('politifact.com', 'fact_checking', 'http://www.politifact.com/'), ('opensecrets.org', 'fact_checking', 'http://www.opensecrets.org/'), ('truthorfiction.com', 'fact_checking', 'https://www.truthorfiction.com/'), ('hoax-slayer.com', 'fact_checking', 'http://www.hoax-slayer.com/'), ('hoax-slayer.net', 'fact_checking', 'http://www.hoax-slayer.net/'), ('badsatiretoday.com', 'fact_checking', 'http://badsatiretoday.com/'), ('climatefeedback.org', 'fact_checking', 'http://climatefeedback.org/'), ('americannews.com', 'claim', 'http://americannews.com/'), ('civictribune.com', 'claim', 'http://civictribune.com/'), ('clickhole.com', 'claim', 'http://www.clickhole.com/'), ('thedcgazette.com', 'claim', 'http://thedcgazette.com/'), ('dailycurrant.com', 'claim', 'http://dailycurrant.com/'), ('dcclothesline.com', 'claim', 'http://www.dcclothesline.com/'), ('derfmagazine.com', 'claim', 'http://www.derfmagazine.com/'), ('duhprogressive.com', 'claim', 'http://duhprogressive.com/'), ('enduringvision.com', 'claim', 'http://www.enduringvision.com/'), ('nationalreport.net', 'claim', 'http://nationalreport.net/'), ('newsbiscuit.com', 'claim', 'http://www.newsbiscuit.com/'), ('newsmutiny.com', 'claim', 'http://newsmutiny.com/'), ('politicalears.com', 'claim', 'http://politicalears.com/'), ('private-eye.co.uk', 'claim', 'http://private-eye.co.uk/'), ('realnewsrightnow.com', 'claim', 'http://realnewsrightnow.com/'), ('rilenews.com', 'claim', 'http://www.rilenews.com/'), ('thenewsnerd.com', 'claim', 'http://www.thenewsnerd.com/'), ('theuspatriot.com', 'claim', 'http://theuspatriot.com/'), ('witscience.org', 'claim', 'http://witscience.org/'), ('amplifyingglass.com', 'claim', 'http://www.amplifyingglass.com/'), ('empiresports.co', 'claim', 'http://www.empiresports.co/'), ('gomerblog.com', 'claim', 'http://gomerblog.com/'), ('huzlers.com', 'claim', 'http://huzlers.com/'), ('itaglive.com', 'claim', 'http://itaglive.com/'), ('politicops.com', 'claim', 'http://politicops.com/'), ('rockcitytimes.com', 'claim', 'http://www.rockcitytimes.com/'), ('thelapine.ca', 'claim', 'https://thelapine.ca/'), ('thespoof.com', 'claim', 'http://www.thespoof.com/'), ('weeklyworldnews.com', 'claim', 'http://weeklyworldnews.com/'), ('worldnewsdailyreport.com', 'claim', 'http://worldnewsdailyreport.com/'), ('21stcenturywire.com', 'claim', 'http://21stcenturywire.com/'), ('activistpost.com', 'claim', 'http://www.activistpost.com/'), ('beforeitsnews.com', 'claim', 'http://beforeitsnews.com/'), ('chronicle.su', 'claim', 'http://chronicle.su/'), ('coasttocoastam.com', 'claim', 'http://www.coasttocoastam.com/'), ('consciouslifenews.com', 'claim', 'http://consciouslifenews.com/'), ('countdowntozerotime.com', 'claim', 'http://countdowntozerotime.com/'), ('counterpsyops.com', 'claim', 'https://counterpsyops.com/'), ('dailybuzzlive.com', 'claim', 'http://dailybuzzlive.com/'), ('disclose.tv', 'claim', 'http://www.disclose.tv/'), ('fprnradio.com', 'claim', 'http://fprnradio.com/'), ('geoengineeringwatch.org', 'claim', 'http://www.geoengineeringwatch.org/'), ('globalresearch.ca', 'claim', 'http://www.globalresearch.ca/'), ('govtslaves.info', 'claim', 'http://www.govtslaves.info/'), ('gulagbound.com', 'claim', 'http://gulagbound.com/'), ('hangthebankers.com', 'claim', 'http://www.hangthebankers.com/'), ('humansarefree.com', 'claim', 'http://humansarefree.com/'), ('intellihub.com', 'claim', 'https://www.intellihub.com/'), ('lewrockwell.com', 'claim', 'https://www.lewrockwell.com/'), ('libertytalk.fm', 'claim', 'http://libertytalk.fm/'), ('naturalnews.com', 'claim', 'http://www.naturalnews.com'), ('newswire-24.com', 'claim', 'https://newswire-24.com/'), ('nodisinfo.com', 'claim', 'http://nodisinfo.com/'), ('nowtheendbegins.com', 'claim', 'http://www.nowtheendbegins.com/'), ('pakalertpress.com', 'claim', 'http://www.pakalertpress.com/'), ('politicalblindspot.com', 'claim', 'http://politicalblindspot.com/ '), ('prisonplanet.com', 'claim', 'http://www.prisonplanet.com/'), ('realfarmacy.com', 'claim', 'http://www.realfarmacy.com/'), ('redflagnews.com', 'claim', 'http://www.redflagnews.com/'), ('thedailysheeple.com', 'claim', 'http://www.thedailysheeple.com/'), ('therundownlive.com', 'claim', 'http://therundownlive.com/'), ('unconfirmedsources.com', 'claim', 'http://unconfirmedsources.com/'), ('veteranstoday.com', 'claim', 'http://www.veteranstoday.com/'), ('worldtruth.tv', 'claim', 'http://worldtruth.tv/'), ('thevalleyreport.com', 'claim', 'https://thevalleyreport.com/'), ('departed.co', 'claim', 'http://departed.co/'), ('myfreshnews.com', 'claim', 'http://myfreshnews.com/')]
2021-01-29 19:40:11,775 - hoaxy(init) - INFO: Done.
Double check your sites.yaml.
I ran the command you gave, but I still get the same error. My sites.yaml file is the one I posted on pastebin in my earlier comment. I don't see anything wrong there, and looking at the error I don't really understand what I should change in the sites.yaml file.
2021-01-29 21:59:09,622 - hoaxy(init) - INFO: Site file /home/ubuntu/.hoaxy/sites.yaml found
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/envs/hoaxy/bin/hoaxy", line 11, in
TypeError: list indices must be integers or slices, not str
I don't really understand the error when I look at the sites.yaml file.
@chathuriw Can you please try with the same yaml file used by @awallemo ?
@awallemo, here is the correct yaml file. Your yaml file is missing the 'sites' tag. (I had to rename it with a .txt extension to upload it. Please remove that extension before you use it.) Let me know if you are still having issues.
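The `TypeError` in the traceback is consistent with that diagnosis. Here is a minimal sketch, assuming the loader indexes the parsed YAML document with the string key `sites` (an assumption; the actual code in `hoaxy/commands/site.py` is not shown in this thread). When the file starts directly with `- name: ...`, a YAML parser returns a list, and indexing a list with a string raises exactly this error. The function `get_sites` is hypothetical.

```python
def get_sites(parsed):
    """Return the list of site entries from an already-parsed YAML document.

    Hypothetical stand-in for the loader, not hoaxy's actual code.
    Assumes the entries live under a top-level 'sites' key.
    """
    if not isinstance(parsed, dict) or "sites" not in parsed:
        raise ValueError("sites.yaml must have a top-level 'sites' key")
    return parsed["sites"]


# What a YAML parser yields when the file starts with "- name: ...": a plain list.
without_key = [{"name": "factcheck.org", "domain": "factcheck.org"}]
# What it yields when the entries are nested under "sites:": a mapping.
with_key = {"sites": without_key}

try:
    without_key["sites"]  # indexing a list with a str
except TypeError as err:
    print(err)  # list indices must be integers or slices, not str

print(get_sites(with_key)[0]["name"])
```

Checking the top-level type before indexing turns the opaque `TypeError` into a message that points at the file layout.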
It worked now, thanks!
2021-01-26 13:16:54,656 - hoaxy(init) - INFO: Site file /home/ubuntu/.hoaxy/sites.yaml found
Traceback (most recent call last):
File "/home/ubuntu/anaconda2/envs/hoaxy/bin/hoaxy", line 11, in
load_entry_point('hoaxy==0.1.0', 'console_scripts', 'hoaxy')()
File "/home/ubuntu/anaconda2/envs/hoaxy/lib/python3.7/site-packages/hoaxy-0.1.0-py3.7.egg/hoaxy/commands/cmdline.py", line 122, in main
File "/home/ubuntu/anaconda2/envs/hoaxy/lib/python3.7/site-packages/hoaxy-0.1.0-py3.7.egg/hoaxy/commands/init.py", line 138, in run
File "/home/ubuntu/anaconda2/envs/hoaxy/lib/python3.7/site-packages/hoaxy-0.1.0-py3.7.egg/hoaxy/commands/init.py", line 115, in init
File "/home/ubuntu/anaconda2/envs/hoaxy/lib/python3.7/site-packages/hoaxy-0.1.0-py3.7.egg/hoaxy/commands/site.py", line 368, in load_sites
TypeError: list indices must be integers or slices, not str
Whenever I run the hoaxy init command I get this error. Any recommendations on what to do?