This is more of a long-term change for the product, one that I might contribute eventually.
Right now, transforming one URL can also change other URLs that partially match it. Moving all transforms from regular expressions to lxml would fix this, since we could match whole attribute values instead of substrings. For now I had to work around it with an ugly hack, though...
Also, we should probably use lxml instead of Beautiful Soup, since it's pretty standard.
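
To make the partial-match problem concrete, here is a rough sketch, not the project's actual transform code. The function names, the sample HTML, and the choice of `href`/`src` as the attributes to rewrite are all assumptions for illustration. It compares a regex-based replacement with an lxml-based one that only touches attributes whose whole value equals the old URL.

```python
import re

import lxml.html


def rewrite_url_regex(html, old, new):
    # Hypothetical stand-in for a regex-based transform: replaces every
    # occurrence of `old`, including inside longer URLs that contain it.
    return re.sub(re.escape(old), new, html)


def rewrite_url_lxml(html, old, new):
    # Parse the fragment (wrapped in a <div>) and rewrite only href/src
    # attributes whose entire value equals `old`.
    root = lxml.html.fragment_fromstring(html, create_parent="div")
    for element in root.xpath(".//*[@href or @src]"):
        for attr in ("href", "src"):
            if element.get(attr) == old:
                element.set(attr, new)
    return root


html = '<a href="/docs">Docs</a><a href="/docs/api">API</a>'

naive = rewrite_url_regex(html, "/docs", "/help")
# naive == '<a href="/help">Docs</a><a href="/help/api">API</a>'
# The second link was changed too, which is the bug.

root = rewrite_url_lxml(html, "/docs", "/help")
hrefs = [a.get("href") for a in root.iter("a")]
# hrefs == ['/help', '/docs/api']
```

The lxml version leaves `/docs/api` alone because it compares whole attribute values rather than searching the raw markup. Real transforms would probably need to cover more attributes and URL normalization than this sketch does.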