lipoja / URLExtract

URLExtract is a Python class for collecting (extracting) URLs from a given text by locating TLDs.
MIT License
245 stars 61 forks

Any string before :// is considered the scheme #168

Open vorenhoutgithub opened 3 months ago

vorenhoutgithub commented 3 months ago

For example, Ltd.https://archives.calvin.edu/index.php is extracted in its entirety, including the leading "Ltd." that is not part of the URL.

Expected behaviour: the extracted URL should start at a complete, valid URI scheme (here https), not at whatever text happens to precede ://.
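
To illustrate the expected behaviour, here is a minimal sketch of a post-processing step that trims an extracted candidate so it begins at a recognised scheme. This is not URLExtract's implementation: the helper name `trim_to_known_scheme` and the `KNOWN_SCHEMES` allow-list are hypothetical and exist only for this example.

```python
# Hypothetical allow-list of schemes; illustrative only, not URLExtract's own list.
KNOWN_SCHEMES = ("https", "http", "ftp", "ftps", "sftp")


def trim_to_known_scheme(candidate: str) -> str:
    """Return candidate starting at a known scheme if one directly precedes '://'.

    If no '://' is present, or the text before it does not end with a known
    scheme, the candidate is returned unchanged.
    """
    sep = candidate.find("://")
    if sep == -1:
        return candidate

    prefix = candidate[:sep].lower()
    # Try longer schemes first so "https" wins over "http" and "sftp" over "ftp".
    for scheme in sorted(KNOWN_SCHEMES, key=len, reverse=True):
        if prefix.endswith(scheme):
            return candidate[sep - len(scheme):]
    return candidate


print(trim_to_known_scheme("Ltd.https://archives.calvin.edu/index.php"))
```

With the example from this issue, the helper drops the "Ltd." prefix and keeps the rest. A suffix match like this is deliberately simple: it would also trim a string such as xhttp://host to http://host, which may or may not be desirable, so a real fix would likely need to decide how strictly the boundary before the scheme is checked.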