Closed 360path closed 4 years ago
I need a way to automate it.
I was thinking about a few options here:
- base-href, where the patched files reside
- upstream, where the downloaded files are kept untouched
Keep it simple.
I would download the pages' HTML into a source folder. From there, the script could use a regex to find <head ...> </head>, insert the base tag just after the opening tag, and save the output as a generated file where the current pages are.
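A rough sketch of that source-to-generated idea, in case it helps. The function name patch_pages and the two folder arguments are made up for illustration, since the thread does not fix a layout. It assumes a sed that supports -E and that the opening head tag sits on a single line.

```shell
#!/bin/sh
# Sketch: copy every page from a source folder into an output folder,
# inserting a <base> tag right after the opening <head ...> tag.
# patch_pages and its two folder arguments are hypothetical names.

BASE_TAG='<base href="https://www.sozialministerium.at/">'

patch_pages() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for f in "$src_dir"/*.html; do
        [ -e "$f" ] || continue   # glob matched nothing
        # Matches <head> or <head attr="..."> but not <header>;
        # & in the replacement re-emits the matched opening tag.
        sed -E "s|<head([[:space:]][^>]*)?>|&${BASE_TAG}|" "$f" \
            > "$out_dir/$(basename "$f")"
    done
}
```

Calling patch_pages source generated would leave the downloaded files untouched and write the patched copies next to them. A head tag split across several lines would not be matched by this pattern.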
I'm not proficient in shell scripting. But maybe this helps?
https://community.idera.com/database-tools/powershell/ask_the_experts/f/learn_powershell_from_don_jones-24/17942/add-html-to-an-existing-web-page
Adding the line is easy. Something like this would do the trick (changes the files 'in place'):
% sed -i '/<head>/a<base href="https://www.sozialministerium.at/">' Neuartiges-Coronavirus-\(2019-nCov\).html
% sed -i '/<head>/a<base href="https://www.sozialministerium.at/">' Coronavirus---Haeufig-gestellte-Fragen.html
I think it would be the easiest to just apply the change to the downloaded files and avoid tracking "source" and "generated" files. Any objections?
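One thing to watch with the in-place variant: running the same command again on an already patched file inserts the tag a second time. A possible guard, as a sketch only. The function name add_base_in_place is hypothetical, and it assumes GNU sed (for -i without a suffix and the one-line form of the a command).

```shell
#!/bin/sh
# Sketch: patch files in place, skipping any that already carry a <base> tag,
# so a second run does not insert the tag twice.
# add_base_in_place is a hypothetical name; GNU sed is assumed.

add_base_in_place() {
    for f in "$@"; do
        if grep -q '<base ' "$f"; then
            continue              # already patched (or page ships its own <base>)
        fi
        sed -i '/<head>/a<base href="https://www.sozialministerium.at/">' "$f"
    done
}
```

It would be called with the downloaded files as arguments, for example add_base_in_place *.html.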
No objections. I was just referring to your ideas:
Creating a separate branch?
- something like base-href where the patched files reside
- something like upstream where the downloaded files are kept untouched
- some kind of Makefile which patches the downloaded files if you want to view them
I didn't look at the source, but NB: the <head> tag could have attributes, and then this wouldn't work anymore?
Right. But currently it doesn't. If that changes in the future, the script needs to be updated.
Yes. But in case that happens, the script will break, I guess. I suppose it's not a priority.
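If the breakage should at least be noticed when it happens, a small check after patching could list the pages that ended up without a base tag. This is only a sketch: check_patched is a hypothetical name, and grep -L (print names of files with no match) is assumed to be available, as it is in GNU and BSD grep.

```shell
#!/bin/sh
# Sketch: report pages that have no <base> tag after patching, e.g. because
# the upstream <head> markup changed and the sed pattern no longer matched.
# check_patched is a hypothetical name.

check_patched() {
    # grep -L prints the names of files that contain no match;
    # '|| true' keeps the assignment from aborting under 'set -e'.
    missing=$(grep -L '<base href=' "$@" || true)
    if [ -n "$missing" ]; then
        printf 'no <base> tag in:\n%s\n' "$missing" >&2
        return 1
    fi
    return 0
}
```

Run as check_patched *.html after the sed step, it returns non-zero and names the affected files, so the failure is visible instead of silent.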
Make all href absolute (referring to www.sozialministerium.at) so that the versioned pages can be viewed directly.
Easier: create a base tag in the head: