Efforts to preserve https://forum.kerbalspaceprogram.com/ for posterity if the worst happens. We are hoping for the best, but expecting the worst.
I found these two URLs in my "ALL" report this month (which doesn't mean they weren't there before, I just noticed them today):
Note the `%7B___base_url___%7D` substring, which decodes to `{___base_url___}`. Almost surely this is a missing `$` after the opening curly brace.

Curious about the issue, and knowing that this kind of issue reproduces like rabbits :P, I coded a quick report of all the occurrences in the current (and WIP) `WARC`s, and boy, I found a lot (note: the file is in CSV format, ignore anything starting with `#`): [Uploading url_weirdities.csv…]()

The earliest thread with the problem is `278`, and the biggest id is `209425`.

`cat url_weirdities.csv | grep -Eo 'https://forum\.kerbalspaceprogram\.com/index\.php\?/topic/([0-9]+)-' | sed -E 's/^https:\/\/forum\.kerbalspaceprogram\.com\/index\.php\?\/topic\/([0-9]+)-$/\1/' | sort -n | uniq`
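For anyone who prefers to do the same extraction in Python, here is a minimal sketch of the pipeline above. It assumes only that each non-comment line of the report contains the topic URL somewhere in it; the exact CSV column layout is not known here, and the function and constant names are hypothetical.

```python
import re
from urllib.parse import unquote

# Percent-encoded form of the unexpanded placeholder, as seen in the URLs.
ENCODED_PLACEHOLDER = "%7B___base_url___%7D"

# Same pattern the grep/sed pipeline uses to pull out the topic id.
TOPIC_ID = re.compile(
    r"https://forum\.kerbalspaceprogram\.com/index\.php\?/topic/([0-9]+)-"
)


def affected_topic_ids(lines):
    """Return the sorted, de-duplicated topic ids found in the report lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):  # the report says to ignore these lines
            continue
        ids.update(int(match) for match in TOPIC_ID.findall(line))
    return sorted(ids)


# Decoding the placeholder shows the literal template variable.
print(unquote(ENCODED_PLACEHOLDER))  # {___base_url___}
```

Calling `affected_topic_ids(open("url_weirdities.csv", encoding="utf-8"))` should give the same list as `sort -n | uniq`, with the `#` lines skipped explicitly.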
Fixing the problem in the `WARC` file is out of the question (the records need to stay exactly as I fetched them), so we need to find a way to work around these problems.

A filter on the playback machine to detect and fix these will do, but we will need a cache to keep the thing responsive - Python is not exactly the fastest cookie in the jar.
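A rough sketch of what such a filter could look like, not the actual playback code. It assumes the placeholder was meant to expand to the forum's base URL (`BASE_URL` below is that assumption), and it uses `functools.lru_cache` as a stand-in for whatever cache the playback machine ends up with. `fix_url` is a hypothetical name.

```python
import re
from functools import lru_cache

# Assumption: the unexpanded placeholder should have become the forum base URL.
BASE_URL = "https://forum.kerbalspaceprogram.com"

# Matches the placeholder both percent-encoded and raw; the hex digits of a
# percent-escape may be upper or lower case, hence IGNORECASE.
PLACEHOLDER = re.compile(r"(?:%7B|\{)___base_url___(?:%7D|\})", re.IGNORECASE)


@lru_cache(maxsize=65536)
def fix_url(url: str) -> str:
    """Return url with any unexpanded placeholder replaced by BASE_URL.

    Results are memoized, so a URL that is requested repeatedly is only
    scanned once. The WARC itself is never modified.
    """
    # A function replacement avoids any escape processing of BASE_URL.
    return PLACEHOLDER.sub(lambda _match: BASE_URL, url)
```

URLs without the placeholder pass through unchanged, so the filter can sit in front of every request; `fix_url.cache_info()` reports hits and misses if the cache size needs tuning.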