c4software / python-sitemap

Mini website crawler to make sitemap from a website.
GNU General Public License v3.0

Stack overflow error #14

Closed. dchaplinsky closed this issue 10 years ago

dchaplinsky commented 10 years ago

$ python3 main.py --domain http://ua.shop-ink.su --output sitemap.xml
Fatal Python error: Cannot recover from stack overflow.

Current thread 0x00007fff7edeb180:
  ....
  File "/Users/dchaplinsky/Projects/python-sitemap/crawler.py", line 201 in __continue_crawling
  File "/Users/dchaplinsky/Projects/python-sitemap/crawler.py", line 197 in __crawling
  ...
Abort trap: 6

c4software commented 10 years ago

Hi,

That's strange. Did you get the "Fatal Python error: Cannot recover from stack overflow." message immediately?

dchaplinsky commented 10 years ago

No, I just omitted the repeated lines from the traceback.

Test it yourself; I gave you the exact parameters.

I've worked around it by setting the stack depth to 1500, but as you can understand, that's a bad fix. Nobody implements this kind of thing recursively, because you cannot predict the depth of the scan.
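
To illustrate the point about recursion, here is a minimal sketch of a crawl loop that keeps its pending URLs in an explicit queue rather than on the call stack, so the depth of the site no longer matters. It is not the project's actual code or fix: `crawl_iteratively`, `get_links`, `fake_links`, and the `example.test` URLs are all hypothetical names chosen for the example.

```python
from collections import deque


def crawl_iteratively(start_url, get_links):
    """Visit every URL reachable from start_url without recursion.

    get_links is a hypothetical callable returning the URLs linked from a page.
    """
    to_visit = deque([start_url])  # explicit work queue replaces the call stack
    seen = {start_url}             # avoids revisiting pages and looping on cycles
    visited_order = []
    while to_visit:
        url = to_visit.popleft()
        visited_order.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                to_visit.append(link)
    return visited_order


def fake_links(url):
    """Stand-in for fetching a page: a chain of 5000 pages, each linking to the next."""
    n = int(url.rsplit("/", 1)[1])
    return [f"http://example.test/{n + 1}"] if n < 4999 else []


# A chain this deep would exceed Python's default recursion limit if crawled recursively.
pages = crawl_iteratively("http://example.test/0", fake_links)
print(len(pages))
```

With this shape, memory use grows with the number of pending URLs instead of with the nesting depth of function calls, so no recursion limit needs to be tuned.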

c4software commented 10 years ago

Fixed in the latest version.

https://github.com/c4software/python-sitemap/commit/508e490e7a4ebe4cead8616277b926d55ae8529a