c4software / python-sitemap

Mini website crawler to make sitemap from a website.
GNU General Public License v3.0

iframe contents ignored #90

Open Garrett-R opened 6 months ago

Garrett-R commented 6 months ago

Iframes are sometimes used to have parts of sites controlled by a CMS.

It would be nice to have the option of inspecting the iframe's content and including any links found there that point to the site being indexed.

It would have to take the `<base>` tag into account: if the base tag matches the site being indexed, then all relative URLs in the iframe should be crawled.
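
To make the request concrete, here is a rough sketch of the link extraction I have in mind, using only the standard library. The names `IframeLinkParser` and `links_from_iframe` are hypothetical and not part of python-sitemap. It assumes the iframe's HTML has already been fetched, and it treats "matches the site" as a plain hostname comparison, which is a simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class IframeLinkParser(HTMLParser):
    """Hypothetical helper: collects <a href> values and the first <base href>."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and self.base_href is None and attrs.get("href"):
            # Only the first <base> element with an href is honoured.
            self.base_href = attrs["href"]
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def links_from_iframe(iframe_html, iframe_url, site_domain):
    """Return absolute links found in the iframe that point at site_domain."""
    parser = IframeLinkParser()
    parser.feed(iframe_html)
    # A <base> href may itself be relative, so resolve it against the iframe URL.
    base = urljoin(iframe_url, parser.base_href) if parser.base_href else iframe_url
    resolved = (urljoin(base, href) for href in parser.hrefs)
    # Assumption: a link belongs to the site if its hostname equals site_domain.
    return [url for url in resolved if urlparse(url).netloc == site_domain]
```

For example, with an iframe served from a CMS host whose `<base href>` points at `https://example.com/blog/`, a relative link such as `post-1` resolves to `https://example.com/blog/post-1` and would be kept, while the same relative link without the base tag resolves against the CMS host and would be skipped.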