Closed: milukyna closed this issue 2 weeks ago
@milukyna Thank you so much for the suggestion. I totally agree and will apply the fix. I appreciate you using the library, finding the bug, and helping us fix it. If you are interested, let me know your email address and I can invite you to our Discord channel to help us. Thank you so much.
Hi,
First of all, thank you for your amazing work on this project! As I was using the tool, I found that in the newest version (0.3.72), the internal logic to extract internal links does not seem to work with relative paths.
What is happening
Example code
Problem
Suppose there is a relative internal link such as "blog/index.html". The above code will give
While we expect the following:
Bug Origin
I believe the problem arises from normalize_url in utils.py.
In the above code, domain would correspond to some_url.com, while in the particular case of relative URLs, we want to keep some_url.com/English.
Fix
Maybe it would be cleaner to use urllib, which is specifically designed to handle such situations.
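A minimal sketch of what a urllib-based approach could look like. This is not the library's actual normalize_url; resolve_link is a hypothetical helper name, and the https scheme and the trailing slash on the base URL are assumptions made for illustration.

```python
from urllib.parse import urljoin


def resolve_link(href: str, base_url: str) -> str:
    """Hypothetical helper: resolve href against the URL of the page it was found on.

    urljoin follows the standard relative-reference rules, so relative paths,
    root-relative paths, and absolute URLs are all handled by the same call.
    """
    return urljoin(base_url, href.strip())


# Assumed page URL; the trailing slash marks "English" as a directory.
base = "https://some_url.com/English/"

# Relative link: the "English" path segment is kept.
print(resolve_link("blog/index.html", base))
# https://some_url.com/English/blog/index.html

# Root-relative link: resolved from the site root.
print(resolve_link("/about", base))
# https://some_url.com/about

# Absolute link: returned unchanged.
print(resolve_link("https://other.example/page", base))
# https://other.example/page
```

One caveat with this approach: if the base URL is given without the trailing slash (https://some_url.com/English), urljoin treats "English" as a file name and resolves "blog/index.html" to https://some_url.com/blog/index.html, so the base should be the final URL of the crawled page.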