gistya / expandr

a cool shell script for git keyword expansion.

possible direction? #1

Open mikeserv opened 10 years ago

mikeserv commented 10 years ago

So, we talked about this the other day. I actually got back in the chat room last night, though I don't think you've seen it - (your profile indicates you haven't logged in since)...

Anyway, here is a really rough 1/4 way there sed implementation - (updated):

tab='       ' nl='
'; LC_ALL=C \
sed '/^#####/,/^#####/d
    /^###[^#]/,/^#['"$tab"' ]*$/!d
    /^#/{   /^###[^#]/!d
};  s/[]|\\$^*[]/\\&/g
    s/\([^ '"$tab#].*[^ $tab"'"#]\).*/\1/
    /^#[ '"$tab"'#]*\(.*\)/{ 
};  s/^[^"]*"\(.*\)/"|\1|/

I read your updated notes this morning - and I think I'm beginning to understand better. I'll be honest about this - my hope is that doing this would help me to learn to use git. I don't know how - and I'd really like to. This is probably obvious considering I posted this as an issue. While that may be true - it's not your issue but mine. Yours works - this doesn't... yet. I'm hoping we can help each other - because I really am very good with sed and portable shell-script, but not much else.

This update actually does less than before. It does not print working sed script. It parses a config file and prints something very near to sed script that should be easily transformable depending on intended action. Given the following input:

        "fi|ll* in with real value"
        "fill in with real value"
        "fill in with real value"
        "fill in with real value"
#^this is the AWS application name^#
#LIVE_APPLICATION_NAME[2]="/fill in walue/!"

        "fill in with real value"
        "fill in with real value"
        "fill in with real value"
        "fill in with real value"

It prints...

"|fi\|ll\* in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|

Basically it wants input like:

     "replace value1"
       "replace value2"

where SECTION replaces the first @ character on any SEARCH_VALUE line. On ###SECTION### and SEARCH_VALUE lines all whitespace is removed. There can be multiple "replace value" lines per SEARCH_VALUE. On "replace value" lines all characters before the first and following the last " are removed, so the leading whitespace there doesn't matter.

Blank lines are ignored. In fact, all lines not beginning w/ 3 hashes, or containing a @ or at least two " are ignored. All lines, that is, but a line beginning with a single # and containing any amount of trailing whitespace. Those lines serve to delimit each SECTION. Oh, and for SECTION lines the trailing hashes are optional - they don't matter. Any line not within a ###SECTION through ^#$ range is ignored - even if it happens to match one of those 3 types of lines that aren't ignored. And any lines within that range that begin with a hash are ignored as well - even if they contain two or more quotes and/or a @.

Probably you can see where my fatal flaw was though with the (SEARCH/replace) labels - those need to be interchangeable. I believe I am working toward that.

I am very interested to know what you think though - as, again, my understanding is admittedly limited. Do you think it could be a workable direction?

My roughly imagined workflow looks like:

  1. shell accepts args and calls first sed/grep to select only relevant SECTION ranges in config file.
  2. those config lines are passed to this sed which translates its input into working sed script
  3. last sed reads it as a script and performs file edits as necessary.
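
The three steps above can be sketched roughly as follows. This is only a minimal illustration, not the script from this thread: the function names (`select_section`, `to_sed_script`, `smudge`) are hypothetical, the config layout is assumed from the description above (`###SECTION###` header, `@SEARCH_VALUE` lines, quoted replacement lines, a lone `#` closing the section), and the second stage uses a plain shell loop instead of sed to keep it short. It only handles the smudge direction.

```shell
#!/bin/sh
# Hypothetical three-stage sketch; all names and the exact output
# format are assumptions made for illustration.

# Stage 1: keep only the lines of one section, from "###NAME" up to
# a line holding a lone "#" (optionally followed by whitespace).
select_section() {  # usage: select_section NAME < config
    sed -n "/^###$1/,/^#[[:space:]]*\$/p"
}

# Stage 2: turn "@KEY" lines and quoted value lines into sed commands.
to_sed_script() {   # usage: to_sed_script NAME < section-lines
    section=$1 key=
    while IFS= read -r line
    do  case $line in
        \#*)     ;;   # headers, comments and the closing "#"
        *\"*\"*)      # a "replace value" line
            val=${line#*\"}     # drop everything up to the first quote
            val=${val%\"*}      # drop everything from the last quote
            # escape what is special in a replacement delimited by "|"
            val=$(printf '%s\n' "$val" | sed 's/[|\\&]/\\&/g')
            printf 's|@%s_%s@|%s|g\n' "$section" "$key" "$val" ;;
        *@*)          # a SEARCH_VALUE line
            key=${line#*@}
            key=${key%%[!A-Za-z0-9_]*} ;;
        esac
    done
}

# Stage 3: run the generated script against the text on stdin.
smudge() {          # usage: smudge CONFIG NAME < file
    script=$(select_section "$2" < "$1" | to_sed_script "$2")
    sed "$script"
}
```

With a config file holding a `###LIVE###` section, something like `smudge ./config LIVE < somefile` (both file names are placeholders) would replace each `@LIVE_KEY@` token with its configured value. A clean action would presumably swap the two sides of each generated `s` command, which then needs the pattern side escaped as well.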

gistya commented 10 years ago

Sorry I've been quite busy with work. You've presented an interesting idea here. I have one question from my initial read-through:

            s||fi\|ll* in with real value|g

Why does it say s||fi\|ll* ... here?


On Fri, 12 Sep 2014, at 7:03 AM, mikeserv wrote:

So, we talked about this the other day. I actually got back in the chat room last night, though I don't think you've seen it - (your profile indicates you haven't logged in since)...

Anyway, here is a really rough 1/4 way there sed implementation:

tab='\t' nl=' '; LCALL=C \ sed -n ' /^#####/,/^#####/d /^###[^#]/,/^#['"$tab"' ]$/!d /^#/{ s/^###['"$tab"' #]// s/['"$tab"' #]$// /^[^#]/h;d }; /^[^"]@/{ G s/['"$tab"' ]//g s/@(.)\n(.)/\2\1@|{/ s/^/'"$tab"'|@/p;d }; /^[^"]"/{ s/// s/|/\&/g s/(._)".*/'"$tab$tab"'s||\1|g\'"$nl"'}/p }'

I only just read your updated notes - and I think I finally understand now. When writing that I didn't consider that the values would have to go both ways, and so I guess this only smudges at the moment - though it doesn't even do that. It only prints working sed code that would smudge. For example, given the following input...


@AWS_ACCESS_KEY_ID
        "fi|ll* in with real value"
@AWS_SECRET_ACCESS_KEY
        "fill in with real value"
@MERCHANT_ID
        "fill in with real value"
@APPLICATION_NAME
        "fill in with real value"

^this is the AWS application name^


LIVE_APPLICATION_NAME[2]="/fill in walue/!"



@AWS_ACCESS_KEY_ID
        "fill in with real value"
@AWS_SECRET_ACCESS_KEY
        "fill in with real value"
@APPLICATION_NAME
        "fill in with real value"
@MERCHANT_ID
        "fill in with real value"
#

It prints...

            s||fi\|ll* in with real value|g

}
|@LIVE_AWS_SECRET_ACCESS_KEY@|{
            s||fill in with real value|g
}
|@LIVE_MERCHANT_ID@|{
            s||fill in with real value|g
}
|@LIVE_APPLICATION_NAME@|{
            s||fill in with real value|g
}
|@SANDBOX_AWS_ACCESS_KEY_ID@|{
            s||fill in with real value|g
}
|@SANDBOX_AWS_SECRET_ACCESS_KEY@|{
            s||fill in with real value|g
}
|@SANDBOX_APPLICATION_NAME@|{
            s||fill in with real value|g
}
|@SANDBOX_MERCHANT_ID@|{
            s||fill in with real value|g
}

Basically it wants input like:


@SEARCH_VALUE1
     "replace value1"
@SEARCH_VALUE2
       "replace value2"
#

where SECTION replaces the first @ character on any SEARCH_VALUE line. On ###SECTION### and SEARCH_VALUE lines all whitespace is removed. There can be multiple "replace value" lines per SEARCH_VALUE. On "replace value" lines all characters before the first and following the last " are removed, so the leading whitespace there doesn't matter. Blank lines are ignored. In fact, all lines not beginning w/ 3 hashes, or containing a @ or at least two " are ignored. All lines, that is, but a line beginning with a single # and containing any amount of trailing whitespace. Those lines serve to delimit each SECTION. Oh, and for SECTION lines the trailing hashes are optional - they don't matter. Any line not within a ###SECTION through ^#$ range is ignored - even if it happens to match one of those 3 types of lines that aren't ignored. And any lines within that range that begin with a hash are ignored as well - even if they contain two or more quotes and/or a @.

Probably you can see where my fatal flaw is though with the (SEARCH/replace) labels - those need to be interchangeable. I believe I can make it do just that though with another sed function or two.

I am very interested to know what you think though - as, again, my understanding is admittedly limited. Do you think it could be a workable direction?

My roughly imagined workflow looks like:

  • shell accepts args and calls first sed/grep to select only relevant SECTION ranges in config file.
  • those config lines are passed to this sed which translates its input into working sed script
  • last sed reads it as a script and performs file edits as necessary.

mikeserv commented 10 years ago

Thanks for replying! I was afraid I had offended and so didn't pursue it further. In any case - that's just to demonstrate that the sed | divider is always escaped. Same goes for the * afterward. I thought that was the intent.
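
As a small illustration of that escaping step, here is the substitution from the script at the top of this issue, wrapped in a function. The name `escape_for_sed` is hypothetical, and this is only a sketch of the one step, not the whole parser.

```shell
#!/bin/sh
# escape_for_sed is a hypothetical helper name. It backslash-escapes
# the characters ] | \ $ ^ * [ so the result can sit inside a sed
# expression that uses "|" as its delimiter.
escape_for_sed() {
    # In the bracket expression "]" comes first so it is taken
    # literally; the final "]" closes the set.
    printf '%s\n' "$1" | LC_ALL=C sed 's/[]|\\$^*[]/\\&/g'
}

escape_for_sed 'fi|ll* in with real value'
# prints: fi\|ll\* in with real value
```

Note that this character set is copied as-is from the script above; it does not include `.` or `&`, so a value containing those would still need extra care depending on which side of the `s` command it ends up on.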

You know what though? I made that edit afterward and didn't include the literal bit in the correct input section... I'll edit it now to show it...

And, while I think that all of these could be done with a g which would hopefully obviate the problem, if you're still interested in knowing how I might handle the replacement if the replace string contains it, I did an answer here today that demonstrates the method - which mainly involves inserting \newlines before every possible match and then removing them before the final printout.

mikeserv commented 10 years ago

I brought it along some today. Here's a link to a working copy. Well, working to some point anyway.

As of now I can:

 ./script -aLIVE,SANDBOX


./script --accounts LIVE,SANDBOX

and get...

"|fi\|ll\* in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|
"|fill in with real value|

Or just one or the other for only the relevant sections. In its current form it doesn't distinguish because it's looking for section headings, so...

./script -aDB


"|fill in with real value|
"|fill in with real value|
"|fill in with real value|

Each of those lines is a positional parameter - the first line in $1 and the second in $2 and so on. And so to reverse their order one might do...

 while [ -n "${2+?}" ]
 do printf %s\\n "$2" "$1"
 shift 2; done
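
To see the loop above in isolation, here is a self-contained sketch with it wrapped in a function. The name `swap_pairs` and the sample arguments are made up for the example.

```shell
#!/bin/sh
# swap_pairs is a hypothetical name. It walks its arguments two at a
# time and prints each pair in reverse order, one argument per line.
swap_pairs() {
    # ${2+?} expands to "?" only while $2 is set, so the loop ends
    # as soon as fewer than two arguments remain.
    while [ -n "${2+?}" ]
    do  printf '%s\n' "$2" "$1"
        shift 2
    done
}

swap_pairs search1 replace1 search2 replace2
# prints, one per line: replace1 search1 replace2 search2
```

A trailing unpaired argument is silently dropped, since the test only passes while a second parameter exists.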

And then when I do...

./script -aDB

I get....

"|fill in with real value|
"|fill in with real value|
"|fill in with real value|

mikeserv commented 10 years ago

I brought it further still. It now ensures its arguments are uppercased - it's a little native shell function I had from something else. It also translates long options to short every time. This is good because we can let getopts do the option parsing - it's pretty good at it - and you can use either or... So, for instance...

./script --accounts live,SANDBOX --action clean
./script -alive,sandbox --action CLEAN
./script -alive,sandbox -Ac

...can all work the same. getopts also has a help menu you can implement - but I haven't done it. I don't even know what most of the stuff does yet. I have brought it back around to printing working sed code - only now it should do something like the right thing for either clean or smudge actions.

./script -alive -Asmudge

        s||fi\|ll\* in with real value|g
        s||fill in with real value|g
        s||fill in with real value|g
        s||fill in with real value|g


./script -alive -Ac

\|fi\|ll\* in with real value|{
\|fill in with real value|{
\|fill in with real value|{
\|fill in with real value|{
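
The option handling described in this comment (long options rewritten to short ones, then getopts, then uppercasing) might look something like the sketch below. Only the option names come from the examples above; the function names `upper` and `parse_args`, the output format, and the use of `tr` in place of the native shell uppercasing function are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the option handling; not the actual script.

# Uppercase a string. This uses tr rather than a pure-shell function.
upper() { printf '%s' "$1" | tr '[:lower:]' '[:upper:]'; }

parse_args() {
    accounts= action=
    # Pass 1: rotate through the arguments, rewriting each long
    # option as its short equivalent so getopts can handle it.
    for arg
    do
        shift
        case $arg in
            --accounts) set -- "$@" -a ;;
            --action)   set -- "$@" -A ;;
            *)          set -- "$@" "$arg" ;;
        esac
    done
    # Pass 2: ordinary getopts parsing of the short options.
    OPTIND=1
    while getopts a:A: opt
    do
        case $opt in
            a) accounts=$(upper "$OPTARG") ;;
            A) action=$(upper "$OPTARG") ;;
            *) return 2 ;;
        esac
    done
    # Accept any spelling that starts with C or S (c, clean, CLEAN...).
    case $action in
        C*) action=clean ;;
        S*) action=smudge ;;
    esac
    printf '%s %s\n' "$accounts" "$action"
}

parse_args --accounts live,SANDBOX --action clean
parse_args -alive,sandbox -Ac
# both print: LIVE,SANDBOX clean
```

Rewriting the long options first keeps the second pass a plain getopts loop, which is the benefit mentioned above.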