VerbalExpressions / implementation

In this repo we will document how each method in the library should behave

Default regular expressions #5

Open psoholt opened 11 years ago

psoholt commented 11 years ago

In CSharp we have made some common regular expressions, such as e-mail and URL, so one can write something like:

verbEx.StartOfLine().Then(CommonRegex.Email);

I think this should be implemented similarly across the different languages.

It's important that if we add default regexes such as e-mail and URL, the underlying regex is identical across the different language ports.

Other examples of common regexes that could be implemented:

email, phone, url, date, ip address, rgb color, hex value, decimal number, time format

See original issue in CSharp: https://github.com/VerbalExpressions/CSharpVerbalExpressions/issues/4

brudgers commented 11 years ago

A thought provoking post, Peder.

For me, this sort of gets at the heart of the question, "What is the purpose of VerbalExpressions?"

And I see two reasonably valid answers.

The first is that VerbalExpressions are supposed to be a collection of prepackaged regular expressions - e.g. phone-number, etc. There's a practical case for this and an endless number of tasks to implement.

The second answer is that VerbalExpressions are intended to be a more readable way of writing regular expressions, e.g. "withAnyCase" instead of "i" and modifier syntax. In this case, there are a finite number of tasks to implement - only those which make VerbalExpressions isomorphic with regular expressions.

For me, the more interesting approach is the second. That doesn't make it better. It doesn't necessarily make it appropriate for the name "VerbalExpressions" either.

All it means is that, that is the direction I'm going with it, and if I need to change the name of my implementation to "VerboseExpressions" that's ok - a rose is a rose.

Ben

Foxboron commented 11 years ago

I disagree with calling it "Verbose". We are essentially just making regex verbal, where you can "speak" or pronounce a regex without sounding weird. It obviously means it will be verbose, but there is no need to even consider a name change. Even after going down the path of making predefined regexes, there is no valid reason to change the name.

brudgers commented 11 years ago

Sorry for not being clear, Morten.

What I meant was that if the direction of Verbal Expressions was toward having phoneNumber and URL as primitives, then I was fine with dropping the name "VerbalExpressions" from whatever work I did in a different direction. That direction being toward implementing regular expressions with a more readable isomorphic language, with primitives such as matchAtLeastOnce or beginClass.

In language terms, what are the atoms of the standard implementation? Is it going to be like Scheme or ANSI Common Lisp?

And I am not saying that there is anything wrong with an implementation that is different from the one I am interested in.

Ben

psoholt commented 11 years ago

I just thought that predefined regexes would be a good idea for people who are not that used to regex. What we do is make regex verbal, and I thought this would be the next step.

Which is more verbal.

If one had predefined expressions without passing them as parameters, it would create a great many different methods, so that's why one has to write something with CommonRegex or similar (there might be a better name):

metal3d commented 11 years ago

Note that Find(), Then() etc. accept strings that are quoted when appended, so a standard regexp string passed to those methods will be escaped...
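To make the escaping problem concrete, here is a minimal C# sketch. It assumes the port quotes its arguments with something equivalent to .NET's `Regex.Escape`; the `EmailPattern` constant is a hypothetical stand-in for illustration, not the pattern used by any port.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapingDemo
{
    // Hypothetical stand-in for a shared e-mail pattern (illustration only).
    public const string EmailPattern = @"[\w.+-]+@[\w-]+\.[\w.-]+";

    public static void Main()
    {
        string input = "user@example.com";

        // Used as-is, the string is interpreted as a regex and matches the address.
        Console.WriteLine(Regex.IsMatch(input, EmailPattern));    // True

        // Once escaped, the metacharacters become literals, so the expression
        // only matches the pattern text itself, not an e-mail address.
        string escaped = Regex.Escape(EmailPattern);
        Console.WriteLine(Regex.IsMatch(input, escaped));         // False
        Console.WriteLine(Regex.IsMatch(EmailPattern, escaped));  // True
    }
}
```

So a predefined pattern passed through an escaping method would silently stop working as a pattern, which is why it needs a path that skips the quoting.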

Allowing specific objects as parameters adds complexity and results in worse performance, because each method has to check whether the given parameter is a value or a component.

Please consider participating in the other questions that are (IMHO) priorities, such as the Not() implementation and the Or() enhancement, before extending functionality. I reviewed the implementations and saw that only 4 languages have implemented "captures". I really think the discussion should prioritise standardisation. But I can be wrong ;)

psoholt commented 11 years ago

@metal3d In CSharp there won't be any performance penalty, as these are method overloads, but that might be the case in JavaScript and in other languages?

If you look at the CSharp code, we don't have the problem of the values being escaped when using these default enums, as these are overloaded methods that say not to escape.

I agree that discussing the Not() implementation, the Or() implementation and captures should be prioritized, but that doesn't mean I can't come up with a feature suggestion. It could always stay here in the issues list for a while (instead of forgetting it ;)

I agree with @Foxboron's answer to your question, @brudgers. We are essentially just making regex verbal, where you can "speak" or pronounce a regex without sounding weird. So this is just a suggestion to make it easier in some cases. But of course it could be created as a separate project, and then it would be possible to combine the default regexes from that project with the VerbalExpressions project if one would like to use both.

I just thought it would make it even easier to write regex verbal and easy.

metal3d commented 11 years ago

@psoholt That was not exactly what I meant :) I said that testing whether the argument is a generic rule or a value to append takes more operations and is not truly optimal. The second problem is that appending a generic rule requires checking whether the string has to be "cleaned" or not, because "string values" are "quoted" on insert.

I really don't know if it's a good idea to add that kind of complexity.

psoholt commented 11 years ago

@metal3d It does make it more complex of course, but not overwhelmingly complex. This is the Maybe() implementation, for example:

    public VerbalExpressions Maybe(string value, bool sanitize = true)
    {
        value = sanitize ? Sanitize(value) : value;
        value = string.Format("({0})?", value);
        return Add(value, false);
    }

    public VerbalExpressions Maybe(CommonRegex commonRegex)
    {
        return Maybe(commonRegex.Name, sanitize: false);
    }

metal3d commented 11 years ago

As you can see, you are using a new argument. Some languages cannot use default values for arguments (Cpp, Go, ...) and would need extra work to accept this. You must add the "sanitize" argument to all the methods... and each call will run a test to know whether sanitize has to be applied. I think that a class derived from VerbalExpression is better. We should keep VerbalExpression as simple as possible so that derived classes can be efficient. This is my humble opinion.
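A rough C# sketch of what the derived-class idea could look like. Everything here is hypothetical: `MiniVerbalExpressions`, `AddRaw`, `CommonVerbalExpressions` and the e-mail pattern are stand-ins invented for illustration, not the API of any existing port.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Minimal stand-in for the core builder: it always escapes string arguments
// and has no "sanitize" flag on its public methods.
public class MiniVerbalExpressions
{
    private readonly StringBuilder _source = new StringBuilder();

    // Appends an already-valid regex fragment without escaping it.
    protected MiniVerbalExpressions AddRaw(string fragment)
    {
        _source.Append(fragment);
        return this;
    }

    public MiniVerbalExpressions StartOfLine() => AddRaw("^");

    public MiniVerbalExpressions Then(string value) =>
        AddRaw("(" + Regex.Escape(value) + ")");

    public Regex ToRegex() => new Regex(_source.ToString());
}

// Hypothetical extension layer: predefined patterns live here, not in the core.
public class CommonVerbalExpressions : MiniVerbalExpressions
{
    // Illustrative pattern only.
    public CommonVerbalExpressions Email()
    {
        AddRaw(@"([\w.+-]+@[\w-]+\.[\w.-]+)");
        return this;
    }
}

public static class SubclassDemo
{
    public static void Main()
    {
        var ve = new CommonVerbalExpressions();
        ve.StartOfLine();  // core method, returns the base type
        ve.Email();        // method added by the derived class
        Console.WriteLine(ve.ToRegex().IsMatch("user@example.com"));  // True
        Console.WriteLine(ve.ToRegex().IsMatch("not an address"));    // False
    }
}
```

One trade-off of this shape in C#: the core methods return the base type, so a chained call such as `StartOfLine().Email()` would not compile, which is why the usage above calls the methods as separate statements.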

metal3d commented 11 years ago

Note that if we have "Add" method public, we can call it directly:

ve.Find("name").Add(CommonRegex.Email)

This is not "verbally" correct... but it's very simple

metal3d commented 11 years ago

I'm sorry to send 3 comments, but I just realized that I'm not being as clear as I want.

What I mean is that PCRE and other regexp implementations have no "common" list of expressions. I think that VerbalExpression should be at the same abstraction level as PCRE. Then developers can extend the library to have common methods.
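In C#, one way to layer common methods on top of an unchanged core is extension methods over a public Add. The sketch below is an assumption-laden illustration: `CoreExpression`, its `Add`/`Find` signatures, `CommonExpressions` and the e-mail pattern are all hypothetical stand-ins, not the real library's API.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Minimal stand-in for the core builder, with a public Add.
public class CoreExpression
{
    private readonly StringBuilder _source = new StringBuilder();

    // Appends a fragment, escaping it unless the caller marks it as a regex.
    public CoreExpression Add(string value, bool sanitize = true)
    {
        _source.Append(sanitize ? Regex.Escape(value) : value);
        return this;
    }

    // Matches the literal text of value, wrapped in a group.
    public CoreExpression Find(string value) =>
        Add("(" + Regex.Escape(value) + ")", sanitize: false);

    public bool IsMatch(string input) => Regex.IsMatch(input, _source.ToString());
}

// Separate "common" package: extension methods, so the core class is untouched.
public static class CommonExpressions
{
    // Illustrative pattern only.
    private const string EmailPattern = @"[\w.+-]+@[\w-]+\.[\w.-]+";

    public static CoreExpression Email(this CoreExpression expression) =>
        expression.Add("(" + EmailPattern + ")", sanitize: false);
}

public static class ExtensionDemo
{
    public static void Main()
    {
        var ve = new CoreExpression().Find("mailto:").Email();
        Console.WriteLine(ve.IsMatch("mailto:user@example.com"));  // True
        Console.WriteLine(ve.IsMatch("mailto:"));                  // False
    }
}
```

Because the extension method takes and returns the core type, calls chain naturally, and a project that does not want the predefined patterns simply does not reference the extra package.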

psoholt commented 11 years ago

@metal3d Ok, I see your point. I will create a different common project for some default expressions, and we will keep Verbal Expression simple.

metal3d commented 11 years ago

Ho... You can wait for other opinions. I'm not the boss :)