Open exportio opened 5 years ago
Hi,
Interesting… I don't have the problem here. What is your Python version?
@samboustani Is the problem still present?
Same problem here.
Try this URL: https://paperarchive.space/
@GovetaXV Hi,
Thanks for the link. Unfortunately, the current version of python-sitemap doesn't support "full JavaScript" websites, which is why paperarchive.space doesn't work.
Sorry
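To illustrate the limitation, here is a minimal sketch, not python-sitemap's actual code, of how a crawler that only reads the served HTML looks for links. The `LinkCollector` class, `extract_links` function, and both sample pages are made up for this example. A page whose links are only created by JavaScript at runtime has no `<a href>` tags in the HTML the server returns, so a crawler like this finds nothing to follow.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in static HTML (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every <a href> found in the raw HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# A classic server-rendered page: the link is present in the HTML itself.
STATIC_PAGE = '<html><body><a href="/about">About</a></body></html>'

# A JavaScript-driven page: the HTML is an empty shell, and links would only
# appear after a browser runs the script, which this parser never does.
JS_PAGE = (
    '<html><body><div id="app"></div>'
    '<script src="/bundle.js"></script></body></html>'
)

static_links = extract_links(STATIC_PAGE)
js_links = extract_links(JS_PAGE)
print(static_links, js_links)
```

Crawling such sites would need something that executes the page's scripts first (a headless browser, for example), which is outside what this sketch covers.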
+1 Same issue. No error log.
This looked pretty hopeful, but didn't work for me either. This isn't a full headless site by any means.
$ python3 main.py --domain https://canada.ca --output sitemap.xml --report
Number of found URL : 1
Number of links crawled : 1
Mikes-MBP-3:python-sitemap mikegifford$ cat sitemap.xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
</urlset>
But maybe this helps.
$ python3 main.py --domain https://canada.ca --output sitemap.xml --debug
INFO:root:Start the crawling process
INFO:root:Crawling #0: https://canada.ca
DEBUG:root:https://canada.ca ==> <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>
INFO:root:Crawling has reached end of all found links
Number of found URL : 1
Number of links crawled : 1
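The `CERTIFICATE_VERIFY_FAILED` line in the debug output suggests the crawl stops at the very first request because Python cannot verify the site's certificate, rather than because of anything in the page itself. A small standalone check, independent of python-sitemap, can help confirm that. This is only a sketch: `make_context`, `is_cert_failure`, and `probe` are hypothetical helper names, and the optional `cafile` argument assumes you have a CA bundle file to point at (for instance the one shipped by the third-party `certifi` package).

```python
import ssl
import urllib.error
import urllib.request


def make_context(cafile=None):
    """Build a verifying TLS context; cafile optionally names a CA bundle."""
    return ssl.create_default_context(cafile=cafile)


def is_cert_failure(exc):
    """True when a URLError wraps a certificate verification failure."""
    reason = getattr(exc, "reason", None)
    return isinstance(reason, ssl.SSLError) and (
        "CERTIFICATE_VERIFY_FAILED" in str(reason)
    )


def probe(url, cafile=None, timeout=10):
    """Fetch url once and report 'ok', 'cert-failure', or 'other-error'."""
    try:
        context = make_context(cafile)
        with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
            resp.read(1)
        return "ok"
    except urllib.error.URLError as exc:
        return "cert-failure" if is_cert_failure(exc) else "other-error"


if __name__ == "__main__":
    # Needs network access; the URL is the one from the report above.
    print(probe("https://canada.ca"))
```

If this prints `cert-failure` with the default settings but `ok` when `cafile` points at a known-good bundle, the Python installation is probably missing its root certificates. On macOS with the python.org installer, running the bundled "Install Certificates.command" is the usual fix, though I can't say for certain that is the cause here.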
python main.py --domain https://www.domain.com --output sitemap.xml --report