Open rumca-js opened 5 months ago
Some pages are geolocalized and may replace their titles with localized text. If we ran the web crawler behind a VPN, we could solve that.
Some pages do not provide a clean title for web crawlers, so we could provide a 'lock title & description' option for notorious pages.
Some link titles and descriptions cannot be obtained correctly because of Cloudflare and other protection mechanisms.
Some entries were edited manually. If a page changes drastically, for example when another company takes over the domain, the title and description should be changed.
Currently we only 'update' data, which does not change the title and description at all. It focuses on filling fields that were not filled and setting the current status code of the page.
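
A rough sketch of how the two behaviours could sit side by side, in case it helps the discussion. All names here (`Entry`, `PageData`, `lock_text`, `update_entry`, `reset_entry`) are hypothetical and not taken from the project's code. The sketch assumes an entry carries a boolean lock flag and that a crawl returns a title, a description and a status code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    # Hypothetical stand-in for a stored link entry; field names are assumptions.
    link: str
    title: Optional[str] = None
    description: Optional[str] = None
    status_code: Optional[int] = None
    lock_text: bool = False  # the proposed 'lock title & description' flag


@dataclass
class PageData:
    # Hypothetical result of one crawl of the page.
    title: Optional[str]
    description: Optional[str]
    status_code: int


def update_entry(entry: Entry, page: PageData) -> None:
    """Behaviour as described above: fill empty fields, set the status code."""
    entry.status_code = page.status_code
    if not entry.title and page.title:
        entry.title = page.title
    if not entry.description and page.description:
        entry.description = page.description


def reset_entry(entry: Entry, page: PageData) -> None:
    """Proposed behaviour: overwrite title and description unless locked."""
    entry.status_code = page.status_code
    if entry.lock_text:
        # Manually edited or notorious pages keep their stored text.
        return
    if page.title:
        entry.title = page.title
    if page.description:
        entry.description = page.description
```

With this split, manually edited entries would set `lock_text` and survive a reset, while a domain takeover could be handled by running `reset_entry` on an unlocked entry. Whether the status code should still be refreshed for locked entries is a design choice; the sketch refreshes it.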