Closed alea12 closed 4 months ago
Thank you for the work! I've tested it with my website and it fits my needs well. One issue I had is that this tool crawls the same URL twice, once with and once without a trailing slash (/).
This could be addressed by updating the conditions of uniqueSubLinks. If this looks good to you, I could submit a PR.
https://github.com/laiso/site2pdf/blob/c3385864ce69c6cac66d6160751cbeff2d73e71a/index.ts#L30-L39
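A rough sketch of the kind of normalization this could use, not the actual code at the permalink above. The helper names `normalizeUrl` and `dedupeLinks` are hypothetical, and it assumes every link is already an absolute URL; only the standard WHATWG `URL` class is used.

```typescript
// Hypothetical helper (not part of site2pdf): strip trailing slashes from
// the path so "/docs" and "/docs/" compare equal. The root path "/" is kept.
function normalizeUrl(raw: string): string {
  const url = new URL(raw); // assumes `raw` is an absolute URL
  if (url.pathname.length > 1) {
    url.pathname = url.pathname.replace(/\/+$/, "");
  }
  return url.toString();
}

// Hypothetical helper: keep the first occurrence of each normalized URL,
// preserving the original order of the links.
function dedupeLinks(links: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const link of links) {
    const key = normalizeUrl(link);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(key);
    }
  }
  return unique;
}
```

With this, `https://example.com/docs` and `https://example.com/docs/` would collapse into a single entry, so the page would only be crawled once. Whether the slash-less form is the right canonical choice depends on how the target site handles redirects.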
@alea12 URL normalization is a great approach. I have been interested in this as well. Please go ahead and create a pull request. Thank you.