fhamborg closed 7 years ago
Hi Felix,
afp.com
It seems that this site is a little odd: you can visit it at www.afp.com, but the domain afp.com does not exist. Here is some ping information:
ping afp.com
ping: unknown host afp.com
ping www.afp.com
PING e10157.e12.akamaiedge.net (23.194.98.182) 56(84) bytes of data.
64 bytes from a23-194-98-182.deploy.static.akamaitechnologies.com (23.194.98.182): icmp_seq=1 ttl=56 time=7.41 ms
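The same check can be scripted. The following is a minimal sketch, not part of hoaxy: the helper names `resolves` and `check_variants` are hypothetical, and the `resolver` parameter exists only so the lookup can be replaced in tests. It asks DNS whether the bare domain and its www. variant resolve.

```python
import socket


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if a DNS lookup for `host` succeeds, False otherwise."""
    try:
        resolver(host, 80)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved ("unknown host").
        return False


def check_variants(domain, resolver=socket.getaddrinfo):
    """Check both the bare domain and its www. variant (hypothetical helper)."""
    bare = domain[4:] if domain.startswith("www.") else domain
    return {host: resolves(host, resolver) for host in (bare, "www." + bare)}
```

Calling `check_variants("afp.com")` returns a dict with one entry per variant. If DNS still behaves as in the ping output above, the bare domain would be expected to fail while www.afp.com succeeds.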
When hoaxy reads the domain list, it ignores the www. prefix, so www.afp.com is treated as afp.com. There are several ways to resolve this problem. One is to load this site from a YAML file; see the sample file sites.sample.yaml. Here is an example for this site:
### afp.com
# required, name of site
- name: afp.com
  # required, primary domain of the site
  domain: afp.com
  # required, type of this site (for example, whether it is a fact-checking site)
  site_type: YOUR SITE TYPE
  # base URL, using www.afp.com
  base_url: http://www.afp.com/
  # site tags, default [], more about this site
  ...
Please check https://github.com/IUNetSci/hoaxy-backend/blob/master/hoaxy/data/manuals/sites.readme.md
Another way is to alter the database. When loading the domain list, use --force-inactive to force loading this site, and then update the site table. The SQL command could be:
UPDATE site
SET base_url='http://www.afp.com/', is_alive=True
WHERE domain LIKE 'afp.com';
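If you prefer to run the update from a script, here is a rough sketch of the same statement with bound parameters. It uses an in-memory SQLite database only so that the snippet is self-contained. The three-column `site` table and the helper `set_base_url` are assumptions for illustration; hoaxy's real database, driver and schema will differ.

```python
import sqlite3


def set_base_url(conn, domain, base_url):
    """Set base_url and mark the site alive; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE site SET base_url = ?, is_alive = 1 WHERE domain = ?",
        (base_url, domain),
    )
    conn.commit()
    return cur.rowcount


# Hypothetical, minimal stand-in for the `site` table (not hoaxy's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site (domain TEXT PRIMARY KEY, base_url TEXT, is_alive INTEGER)"
)
conn.execute("INSERT INTO site VALUES ('afp.com', 'http://afp.com/', 0)")

changed = set_base_url(conn, "afp.com", "http://www.afp.com/")
row = conn.execute(
    "SELECT base_url, is_alive FROM site WHERE domain = 'afp.com'"
).fetchone()
```

Checking the returned row count is a cheap way to catch a misspelled domain, since an UPDATE that matches nothing succeeds silently.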
afp.com/de
Unfortunately, hoaxy currently cannot track a site based on URLs. Maybe in the future we can provide some kind of filter hook. For now, what you can do is use the domain afp.com to track all related URLs (this will, of course, include afp.com/de).
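Until such a hook exists, one option is to filter the collected URLs yourself after tracking the whole domain. The sketch below is not a hoaxy feature; the function `under_prefix` and its parameters are hypothetical. It keeps only URLs on afp.com (with or without www.) whose path lies under /de.

```python
from urllib.parse import urlsplit


def under_prefix(url, host="afp.com", path_prefix="/de"):
    """Return True if `url` is on `host` and its path lies under `path_prefix`."""
    parts = urlsplit(url)
    hostname = (parts.hostname or "").lower()
    # Treat www.afp.com and afp.com as the same site.
    if hostname.startswith("www."):
        hostname = hostname[4:]
    if hostname != host:
        return False
    # Match "/de" itself or anything below "/de/", but not e.g. "/design".
    path = parts.path
    return path == path_prefix or path.startswith(path_prefix + "/")


urls = [
    "https://www.afp.com/de/news/1",
    "https://www.afp.com/en/news/2",
    "https://www.afp.com/design",
]
german_only = [u for u in urls if under_prefix(u)]
```

Matching on the path segment rather than a plain string prefix avoids accidentally keeping unrelated paths that merely start with "de".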
Thanks
Closing issue for now; @fhamborg, feel free to reopen if you have any follow-up questions for Chengcheng. Thanks!
When running hoaxy init with a domains_factchecking.txt that contains the following line, I get the following error.
However, when visiting the domain in my browser, everything seems to work fine.
A second issue is that afp.com is actually publishing in English, but we would like to access the German version that is available at afp.com/de. However, hoaxy only accepts domains and not URLs. Any workaround for that?