macgitver / MacGitver

MacGitver - THE swiss army knife for Git!
8 stars 0 forks source link

feature: shell macroparser #12

Closed antis81 closed 10 years ago

antis81 commented 10 years ago

This parser is planned to perform bash style macro parsing and can be used for "styling" and "scripting" tasks within MGV. There are chances it evolves to a full featured parsing library, but currently the purpose is to "simply get the job done" :100:.

See also 'MacGitver/MacGitverModules -> PR81'.

antis81 commented 10 years ago

:exclamation: :notebook_with_decorative_cover: Re-adding the comment from @scunz:

Okay, here's a subset of the bash parameter expansion that I think is worth to support. In all of these, parameter is subject to the replacement logic itself.

Any matching here refers to the (actually simplified) path name expansion method of bash:

Having that said, I can't await to see it implemented. I think, the "Pathname expansion", "Replacement logic" and "Parameter expansion" should all be separate classes. Actually, I even think that these would be worth for a library on their own :)

antis81 commented 10 years ago

Removed everything concerning the parser, because of a license conflict. @scunz Can we use Apache license for this code parts? If not we cannot use the code and need to rewrite the parser from scratch.

scunz commented 10 years ago

Heho, @antis81 and @claus007.

Please forgive me my ignorance and possible arrogance with respect to the above done work. I've done a half way working example of what I actually had in mind.

To make it clear: If we can answer the questions below positive, then I am all in favor of using the reg exp based approach above. Just because a regular expressions based parser is probably easier to understand for a new commer.

What I was basically thinking is: We actually don't need a parser. What we need looks more like something that lex would spit out. So, please have a look at this code snippet and let's start a discussion on that.

// main.cpp
#include <QCoreApplication>

#include <QStringList>
#include <QString>
#include <QChar>
#include <QHash>

class ShellExpand {
public:
    typedef QHash<QString, QString> Macros;

    ShellExpand(const Macros& macros)
        : mMacros(macros)
    {
    }

private:
    enum Mode {
        // "Normal"
        PlainText,

        // $foo
        SimpleVarRef,

        // ${ ParamPart CommandPart ArgumentPart }
        ParamPart,
        CommandPart,
        ArgumentPart
    };

    struct State {
        Mode        mode;
        QString     input;
        int         recur;
        int         save;
        int         pos;

        QString get() {
            return input.mid(save, pos - save);
        }

        void doSave() {
            save = pos;
        }

        QChar cur() const {
            if (pos < input.length())
                return input.at(pos);
            else
                return QChar();
        }

        State(const QString& in)
            : mode(PlainText)
            , input(in)
            , save(0)
            , pos(0)
        {}
    };

    static inline bool isVarChar(QChar ch) {
        // verify this is right
        return ch.isLetterOrNumber() || ch == L'_';
    }

public:
    QString apply(const QString& input) {
        static QString cmdChars = QLatin1String(":#/%+-=");

        State s(input);

        QString output, partParameter, partCommand, partArgument;

        do {
            switch (s.mode) {
            case PlainText:
                if (s.cur() != L'$') {
                    s.pos++;
                }
                else {
                    output += s.get();
                    s.pos++;
                    s.doSave();
                    if (s.cur() == L'{') {
                        s.mode = ParamPart;
                        s.pos++;
                        s.doSave();
                    }
                    else {
                        s.mode = SimpleVarRef;
                    }
                }
                break;

            case SimpleVarRef:
                if (isVarChar(s.cur())) {
                    s.pos++;
                }
                else {
                    output += replacementLogic(s.get());
                    s.pos++;
                    s.doSave();
                    s.mode = PlainText;
                }
                break;

            case ParamPart:
                if (s.cur() == L'}') {
                    output += replacementLogic(s.get());
                    s.pos++;
                    s.doSave();
                    s.mode = PlainText;
                }
                else if (isVarChar(s.cur())) {
                    s.pos++;
                }
                else {
                    partParameter = s.get();
                    s.doSave();
                    s.mode = CommandPart;
                }
                break;

            case CommandPart:
                if (cmdChars.indexOf(s.cur()) != -1) {
                    s.pos++;
                }
                else {
                    partCommand = s.get();
                    s.doSave();
                    s.recur = 0;
                    s.mode = ArgumentPart;
                }
                break;

            case ArgumentPart:
                if (s.cur() == L'}') {
                    if (s.recur) {
                        s.recur--;
                        s.pos++;
                    }
                    else {
                        partArgument = s.get();
                        output += replacementLogic(partParameter, partCommand, partArgument);
                        s.pos++;
                        s.doSave();
                        s.mode = PlainText;
                    }
                }
                else if (s.cur() == L'{') {
                    s.recur++;
                    s.pos++;
                }
                else {
                    s.pos++;
                }
                break;
            }
            if (s.pos == s.input.length()) {
                if (s.mode == PlainText) {
                    output += s.get();
                    return output;
                }
                // fucked up!
                return QString();
            }
        } while(1);
    }

private:
    QString replacementLogic(QString parameter, QString command = QString(), QString arg = QString()) {

        QString value = mMacros.value(parameter, QString());

        if (!command.isEmpty()) {
            if (command == QLatin1String(":-")) {
                if (value.isEmpty()) {
                    value = apply(arg);
                }
            }
            else if (command == QLatin1String(":=")) {
                if (value.isEmpty()) {
                    value = apply(arg);
                    mMacros[parameter] = value;
                }
            }
            else if (command == QLatin1String(":+")) {
                if (!value.isEmpty()) {
                    value = apply(arg);
                }
            }
            else if (command == QLatin1String(":")) {
                QStringList sl = arg.split(QLatin1String(":"));
                if (sl.count() == 1) {
                    value = value.mid(sl[0].toInt());
                }
                else if (sl.count() == 2) {
                    value = value.mid(sl[0].toInt(), sl[1].toInt());
                }
                else {
                    value = QLatin1String(">Error: Bar arg count<");
                }
            }
            else {
                value = QLatin1String(">Error: Unknonwn command<");
            }
        }

        return value;
    }

    Macros mMacros;
};

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    ShellExpand::Macros m;
    m["Foo"] = "FooBar";

    ShellExpand se(m);
    QString s = se.apply("${verb:=bazing} ${Foo:0:3} ${Bar:=${Foo:3}} the ${Bar} to the ${Foo}! ${verb}");
    qDebug(qPrintable(s));
    return 0;
}

So here's some questions:

scunz commented 10 years ago

Uhm. Apache license? Let me check that... But IIRC, this was the only reason, I've not even started integrating the Picture-Compare stuff. But let me revisit that.

scunz commented 10 years ago

Well, I personally would not be having a problem with including Apache2.0 code with our GPL v2 code. However, section 4 of the Apache License clearly tries to impose additional restrictions onto the derivative work. This is actually not allowed by any GPL/LGPL. So, legally this might indeed be a problem. But I'm not a lawyer and I didn't cross check what exactly the GPL demands (thus did not subtract the intersection from the Apache restrictions, which would actually yield an definitive answer).

claus007 commented 10 years ago

Hi guys,

Am 25.11.2013 01:24, schrieb Sascha Cunz:

Well, I personally would not be having a problem with including Apache2.0 code with our GPL v2 code.

I agree, IMHO using the license that way should work.

However, section 4 of the Apache License clearly tries to impose additional restrictions onto the derivative work. This is actually not allowed by any GPL/LGPL. So, legally this might indeed be a problem. But I'm not a lawyer and I didn't cross check what exactly the GPL demands (thus did not subtract the intersection from the Apache restrictions, which would actually yield an definitive answer).

I am not and I don't want to be a lawyer - too :-) and I am not interested to get you into any trouble.

But it was my decision to publish under the apache license and I beg you to respect it.

Actually I chose this license because it guarantees the software and usage to be really free. By the way that was the reason I did not choose the gpl because it has (much) more implications on the derivative work (!).

:-) I really like writing such small finger jobs and I like supporting you for your free software ! But of course you're also free to write everything on your own... ;-)

Think about, as I already told you I like supporting you! So please don't let this cooperative work fail on some odds like a different license!

cheers Claus

scunz commented 10 years ago

The ASF itself clearly states (by referring to the FSF) that it is not possible to accept Apache License covered code into a GPL v2 project. This covers the intuition I had when I was reading the Apache License yesterday, though the ASF's statement is of course of far more weight and importance than mine.

Though, even if I think [1] that your code actually is licensed under either LGPL 2.1 or GPL 3.0, we might be able to use it. However, this does not respect at all what you actually wanted; so this is out of scope in my opinion.

I am as well sorry as sad about this situation. Esp. since we cannot change the license away from GPL 2. We already have to make use of Qt under the LGPL2.1, which actually grants us the right to use Qt under the GPL V2 rather than the GPL V3 license. Together with the GPL2 code we are using from libgit2, all our 3rd parties are actually GPL'ed and we would lose a lot trying to turn that around.

After all, I'm not dissatisfied with the GPL v2. Its only "real" implications are: Keep the code and any advancements to it free and do not let any third party impose any restrictions. It does not require the author to legally assure that a potential(!) user is actually able to legally use it (which is exactly how I read the Apache License now).

Or in other words, the following would be possible:

Given this possibility, I can only urge anybody using the Apache License (Even the ASF itself) to rethink if this is actually a good choice.

Thanks and sorry again.

[1] Has to be linked against the Qt library to become usable at all, thus if you were to provide us a binary of that code, the source to produce it could only be licensed GPL3 or LGPL2.1 and as a coauthor of both qt and libgit2, I could demand you to give me the source code. Which in turn would give us the right to just use it under GPL3/LGPL2.1.

antis81 commented 10 years ago

Though, even if I think [1] that your code actually is licensed under either LGPL 2.1 or GPL 3.0, we might be able to use it. However, this does not respect at all what you actually wanted; so this is out of scope in my opinion.

That was true for myself when I started MsPiggit and dropped it later on because of switching over to MGV. @claus007 is using Apache V2 license. Sorry I forgot about that in this code and brought the discussion up. But it is very important we respect each others work. I think using GPLv2 makes that possible and is enough for our purposes.

I am as well sorry as sad about this situation. Esp. since we cannot change the license away from GPL 2. We already have to make use of Qt under the LGPL2.1, which actually grants us the right to use Qt under the GPL V2 rather than the GPL V3 license. Together with the GPL2 code we are using from libgit2, all our 3rd parties are actually GPL'ed and we would lose a lot trying to turn that around. ... Given this possibility, I can only urge anybody using the Apache License (Even the ASF itself) to rethink if this is actually a good choice.

Thanks for clearing and taking the time of really reading through this stuff carefully. @claus007: You're a very sophisticated developer. Don't you think you can "jump over your shaddow" and respect MGV's choice of license?

claus007 commented 10 years ago

Grrrr.... I see the problem ... still don't like the GPL....

Okay, alternatives ???

Let's say I would develop software for your project under the GPL ... How can I tell my QtCreator to use different license files for different projects ???

Alternatively, what about the BSD or MIT License would this suit you better ?

claus007 commented 10 years ago

Sorry folks,

really thought about it the last two hours - the parser I wrote is an excellent example of my problem... I wrote it for you, but now I want to turn it into a C++ parser/highlighter.... for official use and therefore I want to have it licensed under "a real free" license... (I also want to use namespaces)

And I can't publish it with two licenses and I don't want to use GPL...

Sorry guys I really regret this!!! But I am out.

Best wishes for the future and success for your program !

Claus

antis81 commented 10 years ago

How can I tell my QtCreator to use different license files for different projects ???

Don't think this is possible by now. I do it manually - sorry. This feature only works for simple structured monolithic projects - which makes it really obsolete. See here: https://bugreports.qt-project.org/browse/QTCREATORBUG-5783

scunz commented 10 years ago

@claus007, sorry to hear that.

@antis81, given the new situation, do you care to take my code and finish it?

antis81 commented 10 years ago

Well, also I don't like that decision I think there's no other choice. I'll do that and move on like always.

scunz commented 10 years ago

I'm thinking about an addition to the rules. Something along the syntax of ${&:/foo/bar} and ${&-:/foo/bar}.

The command being the & or &- here and it's meaning would be to return the content of file :/foo/bar after shell expansion has been applied to it. The dash would instruct it to just read and process the file but actually return an empty string.

This would mean, we could create a file inside the qt resources, that would look like:

${mgvBackgroundColor:=white}
${mgvForegroundColor:=black}

This file could be read with the &- command from every template and would just set some default values for us...

scunz commented 10 years ago

@antis81 maybe a little related to this, did you see https://github.com/scunz/pas2c?

The story behind this code is: I was shown some legacy pascal code that should have been transformed to C/C++. I said that I could write a simple Pascal parser over the weekend. Well, I tried, just for the sake of it. And this repository is how far I got. I think, for a weekend of work, it is pretty far and impressive. So, if we ever need a scripting language.... :imp:

antis81 commented 10 years ago

@antis81 maybe a little related to this, did you see https://github.com/scunz/pas2c?

Yeah, I've seen you working on that. Really impressive. I'm not too pleased about the idea though, adding Pascal as a scripting language - a question of taste, I know :smile:.

scunz commented 10 years ago

Well, I don't think that I would want to write Pascal code myself nowadays. That's something I did 25 years ago. But after having had a deep look at it again, I still must say: I like it much more than the crappy EMCAScript aka JavaScript that everybody nowadays feels to be married to...

scunz commented 10 years ago

Looking very good now. Has a clean history. Nice work.

However, I cannot resist to mention what a naughty boy you are. How can you dare use a force push... Just imagine what will happen to you when someone from the unnamed other project will see that :snake:

antis81 commented 10 years ago

However, I cannot resist to mention what a naughty boy you are.

Well, we know what we are doing, right? :smile: Btw. it would be naughty to not write a comment about it and wait until someone cries :baby_bottle: :smile:. But even then it would not be unrecognized and nothing bad would happen.

Here comes the next one :smile: ... wait for it.

scunz commented 10 years ago

Well, we know what we are doing, right? :smile: Btw. it would be naughty to not write a comment about it and wait until someone cries :baby_bottle::smile:. But even then it would not be unrecognized and nothing bad would happen.

Here comes the next one :smile: ... wait for it.

Nothing to add. Looking forward to it :)

scunz commented 10 years ago

You don't like it when I merge the opening brace to the previous line inside a function? :)

antis81 commented 10 years ago

You don't like it when I merge the opening brace to the previous line inside a function? :)

I did it also in the past, but after having seen so many style guides - each doing different - I don't actually care. It just helps a little in reading VCS's history when moving it to a separate line :smile:. So, for me that is the only reason. Are you using a prettifier on the code, btw. :)?

scunz commented 10 years ago

Are you using a prettifier on the code, btw. :)?

No. Everything I do is hand crafted art :) This includes any deviations and logical errors :smile:

The problem is actually the same as Claus mentioned: I am involved with a bunch of projects. This one uses that style and the next one another and the third is different from both. I just try to match the ones that I can influence to the majority. i.e. my current style for MGV is pretty much the libgit2 style and shares some aspects with the style in Qt and some other projects.

I also have to deal with code that looks like:

void main()    {
    std::shared_pointer<
        SomeNamespace::SomeClass
    > =
        SomeNamespace::SomeClass::spCreate(
            x
            ,
            y
       )
    ;
}

But actually, i don't want to comment on that :-(

antis81 commented 10 years ago

I'm thinking about an addition to the rules. Something along the syntax of ${&:/foo/bar} and ${&-:/foo/bar}. ...

This introduces injection logic and I'd rather avoid that. In the example we populate the macro hash with that. We can do that better by using the new QString apply(QIODevice&):

QString input; // the input string
ShellExpand se(ShellExpand::Macros());
se.apply(QFile(":/mgv/defs.shm");
QString output = se.apply(input);

Content of the file :/mgv/defs.shm:

${mgvBackgroundColor:=white}
${mgvForegroundColor:=black}

Hm, what we still need is a mechanism to get a ShellExpand::Macros hash from the MGV::Config class. What I'm thinking about is converting the QSettings path into underlined vars and add some MGV_SETTINGS_ prefix.

Let's assume we define a common background color (white) for our views views/background-color. Now we turn that into an underlined string:

QHash<QString, QString> Config::toMacroHash()
{
    QString varName = "MGV_SETTINGS_" + "views/background-color";
    varName.replace( QRegExp("[^\\w0-9]", QLatin1Char('_')) );
    QHash<QString, QString> macros;
    macros[varName] = value;
}

Usage in css:

/* these are suggestions, not real world examples */
table {
background-color: ${MGV_SETTINGS_views_background_color}
}
#branch{
background-color: ${MGV_SETTINGS_branch_background_color_active}
}

@scunz What do you think about that?

scunz commented 10 years ago

What do you think about that?

First of all, I think that we should rename the two apply methods into expandText and expandFile; both should be taking just a string where expandFile would read the file. We can actually assume that the file is encoded in utf-8, because anything else would be really stupid, so expandFile could boil down to:

QString expandFile(QString name) {
    QFile f(name);
    if (!f.open(QFile::ReadOnly)) return QString();
    return expandText(QString::fromUtf8(f.readAll()));
}

Second, while I am not against using fileExpand as replacement for the &- idea, I would really like to hear why this is better.

And finally, for several reasons, I am not very fond of the idea to get the macros from the settings. Well, actually, we have to get them from the settings. BUT, this will be far from an automatic process. The values in the settings are stored in QVariants and we need textual processing for them to be suitable to be used in CSS. And also note that not everything suitable for CSS is also suitable for HTML. i.e. <font color="rgb(127,127,127)"> is not allowed while .foo { background-color: rgb(255,255,255); } is pretty valid css. Not to mention that at some point we might want introduce automatic wrapping of macros to be usable inside javascript.

Further, colors will not at all come from the settings.

Generally, I have the feeling that we've just invented a really cool and expandable way to do things and are now reducing it to be able to more easily handle it. That was not the intention. The intention actually was to do more complex things more easily.

antis81 commented 10 years ago

Added the logic for processing external files. Should work like expected. Just one question: This introduces a chance of running into macro name conflicts. At present, the macro is simply replaced (last wins). I'd suggest to introduce something like this:

void ShellExpand::setForceDeclMacros(bool enabled) { mForceDeclMacros = enabled; }
bool ShellExpand::addMacro(const QString& name, const QString& value)
{
    if (!mForceDeclMacros && _macros.contains(name))
        return false;
    _macros[name]  = value;
    return true;
}
scunz commented 10 years ago

How can a macro be replaced by the := command? Actually, we don't have a command to explicitly set a macro to a value. So I'm not concerned by this :smile:

antis81 commented 10 years ago

FYI: Sorry, made a mistake somewhere in the state logic and it does not work. Don't know what's wrong yet.

antis81 commented 10 years ago

Hm ... tried to outsource our private State class to seperate private header / cpp files. But this gives linker errors. Is this possible at all?

Sample ShellExpandPrivate.hpp:

#include "ShellExpand.hpp"

class /*NO EXPORT*/ ShellExpand::State
{
public:
  State();
  ...
  int mode() const;
};

Sample implementation ShellExpandPrivate.cpp

#include "ShellExpandPrivate.hpp"

void ShellExpand::State::State() {}

int ShellExpand::State::mode() { return 0; }

Sample usage in ShellExpand.cpp:

void ShellExpand::MyMethod()
{
  State s; // this seems to work ... hm ...

  int m = s.mode(); // cannot be resolved by linker ... why? It is within the same library.
}
antis81 commented 10 years ago

From somewhere above :)

FYI: Sorry, made a mistake somewhere in the state logic and it does not work. Don't know what's wrong yet.

Everythings good. The mistake was in the caller, which generated two instances of ShellExpand. So the hash became "empty". Aaargh ... another ~2h wasted time.

scunz commented 10 years ago

If I may, I want to suggest another feature that would fit very good into my plans:

  1. Add a command like ${parameter!word}
  2. Change Macros from QHash<QString, QString> to QHash<QString, QVariant>
  3. Change all usages of Macros to do an explicit toString() on the variant
  4. Add a class LoopCallback { public: virtual bool init(SHExpand*) = 0; virtual bool next(SHExpand*); };
  5. Add a hash of LoopCallback pointers to SHExpand
  6. When encountering a ! command, do the following:
    1. Create a backup of the current Macros; set "command" to "&-"
    2. Search the LoopCallback-List for a key named by parameter
    3. If not found, yield empty string
    4. if found, call init on the LoopCallback (passing this as argument)
    5. if init returned false, restore macros from backup and yield empty
    6. Set "result" to empty (QString)
    7. Forever:
      1. Invoke replacement logic, append returned value to "result"
      2. call next on the LoopCallback (passing this as argument)
      3. reset macros to backup
      4. if next returned false, yield "result"

I will implement that (later), if it's okay to do that :)

antis81 commented 10 years ago

If I may, I want to suggest another feature that would fit very good into my plans: ...

Feel free to do so :smile: :+1:

scunz commented 10 years ago

As for the outsourcing: Would have to see that code to tell. Sound weird.

Everythings good. The mistake was in the caller, which generated two instances of ShellExpand. So the hash became "empty". Aaargh ... another ~2h wasted time.

Well, finding the cause is actually good news, isn't it? Don't let me count up the times that I wasted with "finding my own idiotic mistakes" :) However, I still think that the two flush() invocations are wrong: here and here

scunz commented 10 years ago

You have written:

int mode() const;

but in:

int ShellExpand::State::mode() { return 0; }

the const is missing. So, unless you have 2 different declarations of State this should not even compile (or is just a copy+paste error)

antis81 commented 10 years ago

However, I still think that the two flush() invocations are wrong: here and here

Yes, I'm currently working at that place. What do you mean by advance? Isn't that our changeMode() call?

antis81 commented 10 years ago

the const is missing. ...

Ooops, I wrote the sample code on the fly in GH. So it just shows a stub of what it looks like. Should I commit the non-working version to separate branch? Maybe this won't work at all.

scunz commented 10 years ago

Yes, changeMode comes close to what is required here. advance in "typical" lexer/parser-speak means "goto the next token without interpreting it"

And yes, give me a branch and i'll look into it ^^

antis81 commented 10 years ago

Ok, branch has landed ^^: nf/temp/outsourced-shm-state

scunz commented 10 years ago

Hm, well, I am wondering why you are trying to split such tiny part of code into a separate translation unit at all. Since the merge of the RM branch, my notebook takes 5 minutes to do a single cored rebuild of all of the MGV code. As I am currently about to do some library splits in libHeaven and plan to add some additional modules, this will probably soon increase to somewhat between 10 and 15 minutes. I am not really looking forward to that.

But the only thing around that is to reduce the number of translation units...

scunz commented 10 years ago

Repository/ProgressDlg.cpp in Modules is another candidate to make usage of this. (But I think we should better integrate that code into the Core User Interface???)

antis81 commented 10 years ago

@scunz I'm a little stuck on how to get the recursion right ... Wrote a test wich results in endless recursion. I'll commit it here for discussion.

antis81 commented 10 years ago

@scunz Should we merge here? There's one outstanding feature you had in mind somewhere above.

If I may, I want to suggest another feature that would fit very good into my plans: ...

scunz commented 10 years ago

@antis81 actually, there's lots of features still missing from the original proposal, AFAIK. OTOH, this is enough for our needs for now. Let's just get it in...

antis81 commented 10 years ago

@scunz I think it is even more than just "fits our needs". The most important thing for me is, that it is easy extensible at this point. And that's given by now. Of course there's things missing like deep error handling and so forth. We could even access the recursion level to tell, that a closing bracket is missing. But I think we can address those features in the future and get the job done here. So like you said ... let's get it in and use it :exclamation:.

scunz commented 10 years ago

why restore the branch? :)

Uhm, well, it fits in as a really small and tiny piece into my big picture. Just to mention, at this point, I have roughly 15.000 lines of code aimed to be included in MGV someday. But for now these are just hanging around because either the code is not ready for the time or the time is not ready for the code :dog: