chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

[Feature Request]: Addition of basic `param string` methods into the standard library. #25582

Open Iainmon opened 3 months ago

Iainmon commented 3 months ago

Summary of Feature

Description: I have been doing some string parsing at compile time and want to suggest these functions to be included in the standard library. I will open a PR if they all look admissible.

Code Sample

proc param string.this(param start: int, param stop: int) param do
    return this.slice(start,stop);

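// Returns the characters at indices start..<stop (half-open), built recursively
// at compile time; idx is the recursion cursor and normally keeps its default.
proc param string.slice(param start: int, param stop: int, param idx: int = start) param {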
    compilerAssert(start <= stop);
    compilerAssert(stop <= this.size);
    if start <= idx && idx < stop {
        return this[idx] + this.slice(start,stop,idx + 1);
    } else {
        return "";
    }
}

proc param string.take(param count: int) param do
    return this.slice(0,count);

proc param string.drop(param count: int) param do
    return this.slice(count,this.size);

proc param string.countOccurrences(param c: string, param idx: int = 0) param {
    if idx == this.size {
        return 0;
    } else if c == this[idx] {
        return 1 + this.countOccurrences(c,idx + 1);
    } else {
        return this.countOccurrences(c,idx + 1);
    }
}

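proc param string.takeUntil(param del: string, param idx: int = 0, param keepDel: bool = false) param {
    if idx >= this.size {
        // Delimiter not found: stop at the end instead of indexing out of bounds.
        return "";
    } else if this[idx] == del {
        if keepDel then
            return this[idx];
        else
            return "";
    } else {
        // Forward keepDel so the caller's choice survives the recursion.
        return this[idx] + this.takeUntil(del,idx + 1,keepDel);
    }
}
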
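
A minimal usage sketch of the proposal, assuming the `slice` method compiles as posted (it is copied here only to keep the example self-contained). The names `greeting`, `head`, and `tail` are hypothetical. The point it illustrates is that the results are themselves `param` values, so they can feed further compile-time checks.

```chapel
// Copied from the Code Sample above so this sketch stands alone.
proc param string.slice(param start: int, param stop: int, param idx: int = start) param {
  compilerAssert(start <= stop);
  compilerAssert(stop <= this.size);
  if start <= idx && idx < stop {
    return this[idx] + this.slice(start, stop, idx + 1);
  } else {
    return "";
  }
}

// Hypothetical inputs, chosen only for illustration.
param greeting = "hello world";
param head = greeting.slice(0, 5);              // half-open: indices 0..4
param tail = greeting.slice(6, greeting.size);  // index 6 through the end

// Both results are params, so this check is resolved by the compiler.
if head != "hello" then
  compilerError("unexpected prefix: " + head);

writeln(head);
writeln(tail);
```

Note that `slice(start, stop)` as posted excludes `stop`, so `slice(0, 5)` yields five characters.
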
e-kayrakli commented 3 months ago

Thanks for the request, Iain! Some docs could help solidify your proposal here.

First some specific comments:

But I have some high-level ones as well:

At times we have talked about a package/mason module like StringUtil. take and drop especially sound like they could be suitable for something like that. The reason I am not too keen on having them in the standard library is that what they do can be achieved with slicing (ignoring the param part for a second, which I see as a lack of implementation):

myString.take(3) == myString[..2] == myString[0..#3] == myString[..<3]
myString.drop(3) == myString[3..]

Maybe methods are more self-documenting than slicing. I am not sure I believe that, but even if so, I am not sure it is enough of an advantage to put them in the standard library (vs. a package/mason module).
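
The slicing equivalences quoted above can be checked with a short runtime sketch. The string value and variable names are assumptions made for illustration, and this uses ordinary (non-param) strings, which is the case where slicing already works today.

```chapel
// Runtime (non-param) slicing on an ASCII string.
const myString = "chapel-lang";

const a = myString[..2];     // indices 0, 1, 2
const b = myString[0..#3];   // three indices starting at 0
const c = myString[..<3];    // every index below 3
const rest = myString[3..];  // index 3 through the end

writeln(a, " ", b, " ", c);  // prints "cha cha cha"
writeln(rest);               // prints "pel-lang"
```
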

I am against proc this(start, end). It is too much of a workaround for the lack of param ranges. Something like that could appear in an application if you want it to, but even that feels like bad practice to me, as it would lead to patterns like myString[3,5], which is not really clear to me, especially considering that what you want it to do is already representable in the language via myString[3..5], albeit losing the param-ness.

Depending on your need for compile-time processing, we could consider implementing bounded param ranges.

Iainmon commented 3 months ago

Thank you Engin. I think you are right and waiting for param ranges would be a better option. As for your questions,

e-kayrakli commented 3 months ago

> The slicing function is just normal slicing at compile time

Yeah, but this is what you use for slicing, right? proc slice is more of a helper? Unless I am missing what idx could be used for.

> I was using these for parsing param strings and found them quite useful. But after thinking about it, they do seem sort of application specific and strange.

I can see them being useful. And I wouldn't really call them strange. Those (and their non-param versions) could make a good StringUtil package/mason module that we can grow further as more similar things come along.

mppf commented 3 months ago

Do these match things on non-param strings? IMO having param versions of some string functions is a great idea. That said, I think (?) these are different from existing non-param string operations and so perhaps go into a StringUtil or similar module.
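
For comparison, here is one way the non-param counterparts could look in a StringUtil-style module. This is a sketch under stated assumptions, not an existing API: `take`, `drop`, and `takeUntil` are hypothetical methods named after the proposal, built on the standard slicing and `partition` operations. Counting occurrences is already covered for non-param strings by the existing `count` method.

```chapel
// Hypothetical non-param counterparts; the names mirror the proposal.
proc string.take(count: int): string {
  return this[0..#count];   // the first `count` characters
}

proc string.drop(count: int): string {
  return this[count..];     // everything after the first `count` characters
}

proc string.takeUntil(del: string, keepDel: bool = false): string {
  // partition splits around the first occurrence of `del`; if `del` is
  // absent, `before` is the whole string and `sep` is empty.
  const (before, sep, _) = this.partition(del);
  return if keepDel then before + sep else before;
}

const s = "key=value";
writeln(s.take(3));
writeln(s.drop(4));
writeln(s.takeUntil("="));
writeln(s.takeUntil("=", keepDel=true));
writeln(s.count("e"));      // existing standard method
```

Unlike the recursive param versions, these take `keepDel` without an index cursor, since no compile-time recursion is needed.
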