AOSC-Archive / autobuild3

AOSC OS package maintenance toolkit (version 3)
https://aosc.io
GNU General Public License v2.0

RPM space processing should go further to escaping. #40

Closed: Artoria2e5 closed this issue 9 years ago

Artoria2e5 commented 9 years ago

It seems that entries in RPM %files are shell globs that also go through RPM % macro expansion, which I should have mentioned quite a long time ago in... maybe the QQ group.

From the poorly written doc, I got the following script for generating a file list:

rm -f $RPM_BUILD_DIR/filelist.rpm 
echo '%defattr(-,root,root)' >> $RPM_BUILD_DIR/filelist.rpm 
find $RPM_BUILD_ROOT/%{_prefix} -type f -print | sed "s!$RPM_BUILD_ROOT!!" |  perl -pe 's/([?|*.\'"])/\\$1/g' >> $RPM_BUILD_DIR/filelist.rpm 

# RPM: %files -f filelist.rpm

When we simplify it, the perl -pe 's/([?|*.\'"])/\\$1/g' filter at the end of the pipeline is actually the core of this escaping. So we got:

transform(){ perl -pe 's/([?|*.\'\''"])/\\$1/g'; }
rpmfiles(){
        local _strlen_pkg="${#PKGDIR}"
# Question: Shall we just use -type f?
        # IFS= and -r keep leading spaces and backslashes in file names intact.
        find "$PKGDIR" | while IFS= read -r i; do
                [[ -d "$i" && ! -L "$i" ]] && continue
                # -q: only test for a match, do not print the matched path.
                grep -qFx "$i" "$SRCDIR/autobuild/conffiles" && echo "%config"
                echo "%defattr($(stat --printf %a,%U,%G "$i"),-)"
                echo "${i:_strlen_pkg}" | transform
        done
}
transform_test(){ echo "whoami**?I am a monster.\"\\|'" | transform; }

But this is sooooooo slow and introduces a perl dependency. BTW the doc's version is badly wrong, since shells don't accept backslash escapes inside single quotes. The correct one is perl -pe 's/([?|*.\'\''"])/\\$1/g'.
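
To see why the quoting in the doc cannot work, here is a minimal sketch of the '\'' idiom: close the single-quoted string, append an escaped quote, and reopen it. The variable names and sample strings are made up for illustration.

```shell
# Inside single quotes a backslash is literal, so \' cannot embed a quote.
# '\'' closes the string, appends an escaped ', and reopens the string.
with_quote='it'\''s'
printf '%s\n' "$with_quote"   # prints: it's

# The backslash before the closing quote stays a literal backslash:
literal='a\'
printf '%s\n' "$literal"      # prints: a\
```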

First we saw a Perl RE, with a $1 which means the first matching group. We can rewrite it as sed -re "s/([?|*.\'"'"])/\\\1/g'. Then we get a different test result: for \|, perl gives \\| and sed gives \\\|. When I take a look in DebuggerEx with ([?|*.\'"'"]), it seems that perl did one more step of unescaping and changed \' to ', so the backslash itself is not in perl's character class and never gets escaped. The sed one seems to be correct.

Then we can eliminate the ERE dependency by using & (the whole match) instead of a capture group: sed -e "s/[?|*.\'"'"]/\\&/g'. This is quite trivial.
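
As a quick check that the two sed forms above agree, here is a small sketch. The function names escape_ere and escape_bre and the sample input are hypothetical. It uses -E rather than -r for the ERE variant, on the assumption that the local sed accepts it (GNU and BSD sed both do).

```shell
# ERE form: capture the special character, emit a backslash plus group 1.
escape_ere() { sed -E "s/([?|*.\'"'"])/\\\1/g'; }
# BRE form: no group needed, & stands for the whole match.
escape_bre() { sed -e "s/[?|*.\'"'"]/\\&/g'; }

sample='a?b*c.d\|e'
out_ere=$(printf '%s\n' "$sample" | escape_ere)
out_bre=$(printf '%s\n' "$sample" | escape_bre)
printf '%s\n' "$out_ere" "$out_bre"   # both lines: a\?b\*c\.d\\\|e
```

Inside a sed bracket expression the backslash is an ordinary character, so \ is escaped along with the others. That is the difference from the perl version.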

And then we can make it shell.

transform(){
  # backslashes should be processed first
  # The doc author seems to have forgotten about spaces.
  # If a line like a\ file\ with\ spaces works in RPM, then this is correct.
  local p j badchar=(\\ \" \' \| \* \? . ' ') # Wait, why literal dots?
  p="$(cat)" # read all of stdin
  # Quote "$j" so that * and ? are matched literally, not as glob patterns.
  for j in "${badchar[@]}"; do p="${p//"$j"/\\$j}"; done
  # RPM guys, again, forgot the % stuffs.
  echo "${p//%/%%}"
}
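
One thing to watch in a pure-shell version: in ${var//pattern/replacement} an unquoted pattern is treated as a glob, so * and ? have to be quoted to match literally. Below is a minimal bash sketch of the same idea that takes an argument instead of reading stdin. The name rpm_escape and the sample inputs are hypothetical, and whether RPM really accepts backslash-escaped spaces is still the open assumption from the comments above.

```shell
#!/bin/bash
# rpm_escape is a hypothetical helper: it escapes its first argument using
# only bash parameter expansion, with no perl or sed.
rpm_escape() {
  local p=$1 j
  # Backslash goes first so that the backslashes added later are not doubled.
  for j in \\ \" \' \| \* \? . ' '; do
    p="${p//"$j"/\\$j}"   # "$j" is quoted: match it literally, not as a glob
  done
  printf '%s\n' "${p//%/%%}"   # % is doubled to survive RPM macro expansion
}

rpm_escape 'a b*c%d'   # prints: a\ b\*c%%d
rpm_escape 'x\y'       # prints: x\\y
```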

Artoria2e5 commented 9 years ago

Btw as I always do, I think all RPM doc is trash.

MingcongBai commented 9 years ago

Please go ahead, and we will test it with several packages:

  1. llvm
  2. 0ad

They all have ambiguous file names, with [ and ( that can get in the way of scripting...

Artoria2e5 commented 9 years ago

@MingcongBai Again you reminded me of those things! The rpm guys must have forgotten them again…

MingcongBai commented 9 years ago

@Arthur2e5 That's pretty unfortunate, but! but! Keep in mind that writing out a whole list of the files included in a package is just unnecessary, though at the same time I am pretty much unsure about how to handle them.

Simply put a /* in the file list, then list anything with special attributes after that?

Artoria2e5 commented 9 years ago

@MingcongBai that is acceptable, but maybe we have to do something with the attributes.

MingcongBai commented 9 years ago

@Arthur2e5 What I am unsure of right now is whether, with /*, the attributes of that wildcard will be passed on to any file listed after it.

Artoria2e5 commented 9 years ago

Well, my dear readers, we posted something by mistake in #55.