list of needed string utilities

gronki commented 4 years ago

Many will disagree, but I think Fortran strings (character(len = :), allocatable) are pretty dope for a low-level language. However, the lack of standard utilities to handle them is a pain. Adding a few utilities would be an extremely easy way to greatly increase the value of the language.

I suggest the following format for proposals:

1. name of the utility
2. short description
3. does it exist in other languages
4. proposed example of usage

gronki commented 4 years ago

1. split
2. Given a separator, splits the string into some form of array.
3. Most languages have some form of that utility. Example in Python:


In [1]: "i am a sheep".split()                                                                  
Out[1]: ['i', 'am', 'a', 'sheep']

In [2]: "i am a big sheep".split('big')
Out[2]: ['i am a ', ' sheep']

4. There are other possible ways for it to work, just showing one:

character(len = 16) :: arr(10)
call split("i am a sheep", arr)
call split("i am a big sheep", arr, delim = "big")
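For illustration only, here is one way a routine with that calling shape could be written in today's Fortran. This is a minimal sketch, not the J3 proposal: the name `split_str`, the optional count argument `n`, and the choice to skip empty tokens are all assumptions made for this example. Tokens shorter than the array's element length are blank-padded, and longer ones are silently truncated.

```fortran
module split_sketch
  implicit none
  private
  public :: split_str   ! hypothetical name, chosen for this sketch

contains

  ! Split str on delim (default: a single blank) into the fixed-length
  ! elements of arr. Empty tokens are skipped. Unused elements are left
  ! blank. The optional n (an assumption, not in the original proposal)
  ! returns the number of tokens stored.
  subroutine split_str(str, arr, delim, n)
    character(len=*), intent(in)            :: str
    character(len=*), intent(out)           :: arr(:)
    character(len=*), intent(in),  optional :: delim
    integer,          intent(out), optional :: n
    character(len=:), allocatable :: d
    integer :: pos, start, last, ntok

    if (present(delim)) then
      d = delim
    else
      d = ' '
    end if

    arr = ''
    ntok = 0
    start = 1
    do while (start <= len(str) .and. ntok < size(arr))
      ! An empty delimiter is treated as "no delimiter found".
      pos = 0
      if (len(d) > 0) pos = index(str(start:), d)
      if (pos == 0) then
        last = len(str)          ! rest of the string is the last token
      else
        last = start + pos - 2   ! character just before the delimiter
      end if
      if (last >= start) then
        ntok = ntok + 1
        arr(ntok) = str(start:last)   ! pads or truncates to len(arr)
      end if
      start = last + len(d) + 1
    end do

    if (present(n)) n = ntok
  end subroutine split_str

end module split_sketch
```

With `character(len = 16) :: arr(10)`, calling `split_str("i am a big sheep", arr, delim = "big")` stores `"i am a "` and `" sheep"` in the first two elements, each padded to 16 characters.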

LKedward commented 4 years ago

Being a Fortran programmer, I like my programs to parse user config files as case-insensitive where possible. So most of my programs have an upperStr routine included.

1. upperStr(...)/lowerStr(...)
2. Convert a character string all to upper/lower case
3. yes

character(len = 11) :: myString, myStringUpper
myString = "Hello World"
myStringUpper = upperStr(myString)
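A routine like this is short to write by hand, which is part of why so many programs carry their own copy. The sketch below assumes the ASCII character set and leaves every non-letter character unchanged; the module name `case_sketch` is made up for the example, while `upperStr` and `lowerStr` follow the names used above.

```fortran
module case_sketch
  implicit none
  private
  public :: upperStr, lowerStr

contains

  ! Return a copy of str with ASCII letters a-z converted to A-Z.
  pure function upperStr(str) result(res)
    character(len=*), intent(in) :: str
    character(len=len(str))      :: res
    integer :: i, c

    res = str
    do i = 1, len(str)
      c = iachar(str(i:i))
      if (c >= iachar('a') .and. c <= iachar('z')) res(i:i) = achar(c - 32)
    end do
  end function upperStr

  ! Return a copy of str with ASCII letters A-Z converted to a-z.
  pure function lowerStr(str) result(res)
    character(len=*), intent(in) :: str
    character(len=len(str))      :: res
    integer :: i, c

    res = str
    do i = 1, len(str)
      c = iachar(str(i:i))
      if (c >= iachar('A') .and. c <= iachar('Z')) res(i:i) = achar(c + 32)
    end do
  end function lowerStr

end module case_sketch
```

Because the result has the same length as the argument, the functions work with fixed-length variables as in the example above and with deferred-length allocatable strings alike.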

rweed commented 4 years ago

I think a new intrinsic module (ISO_FORTRAN_STRINGS) was proposed for F202x that included split along with some other new string routines (extract, insert, replace, etc.). Last time I looked on the J3 site, though, it appeared that only split has survived. I think this was proposed to bring some of the functionality in the old VARYING_STRINGS module into the mainline standard. I would be interested in hearing from the committee members on the status of this, since it's one of the few things I see proposed for F202x that I would use immediately if available.

marshallward commented 4 years ago

A broader question may be whether Fortran should consider including a standard library. String manipulation would be a natural component of such a library.

rweed commented 4 years ago

See https://j3-fortran/org/doc/year/18/18-259r2.txt for the original proposal.

rweed commented 4 years ago

Oops. j3-fortran.org not j3-fortran/org

FortranFan commented 4 years ago

split

Readers may note Part 2 of the Fortran standard, "Varying length character strings" which is slated for deletion.

In response, there was a proposal by J3 to include certain intrinsic procedures to Part 1 of the Fortran standard: see US03 in that link.

However WG5 decided at the Tokyo meeting earlier this year to only consider SPLIT for Fortran 202X, here's where it stands at present: https://j3-fortran.org/doc/year/19/19-254r2.txt

milancurcic commented 4 years ago

> A broader question may be whether Fortran should consider including a standard library. String manipulation would be a natural component of such a library.

Sorry, off-topic, but how is the current set of intrinsic procedures not a standard library? Sure, it's not called standard library in the standard, and it's available by default in the global namespace, but otherwise I can't tell the difference.

I use the term standard library throughout my book to refer to the set of intrinsic procedures and modules. This term is easier to understand to the broader audience.

marshallward commented 4 years ago

I would have said that a standard library could, for example, define a function that does not require an extension of the language itself, and could (in principle) be something that could be implemented in Fortran. In the case of string manipulations, it could help to delineate operations which are currently tedious vs impossible to implement.

But I definitely agree that this is potentially off-topic and I somewhat regret the comment. Apologies for the distraction.

sblionel commented 4 years ago

https://j3-fortran.org/doc/year/19/19-196r3.txt is the latest paper on this. There was some opposition to doing even this much. My own preference would have been to do more. The prevailing notion was that procedures that are straightforward to implement by users don't need to be intrinsics. I would like to see this revisited for 202Y.

On a semi-related note, see https://j3-fortran.org/doc/year/19/19-197r3.txt , which I worked on.

gronki commented 4 years ago

Thanks for the answer. It would be awesome if this could be revisited. The attitude that functions that are easy to implement should not be standardized is, in my opinion, absolutely detached from reality and outrageous. Keeping my fingers crossed for your (and others') efforts to push through improvements in Fortran string handling!


FortranFan commented 4 years ago

@sblionel wrote:

https://j3-fortran.org/doc/year/19/19-196r3.txt is the latest paper on this. ..

As I commented in https://github.com/j3-fortran/fortran_proposals/issues/96#issuecomment-557560514, the current state of development of a new SPLIT intrinsic appears instead to be this.

FortranFan commented 4 years ago

@sblionel wrote:

.. There was some opposition to doing even this much. My own preference would have been to do more. The prevailing notion was that procedures that are straightforward to implement by users don't need to be intrinsics. I would like to see this revisited for 202Y. ..

Can't help but go OT: the basic need for a standard is for the widest group of practitioners to have

a standard name e.g., is a procedure to be named REMOVE or ERASE; SPLIT or STRTOK or TOKENIZE, etc.!?
a standard interface e.g., subroutine or a function; what is the list, order, naming, and type, kind, rank of method parameters?
standard documentation and dissemination of method characteristics.

for the most commonly needed instructions.

Considering that so much of the information to be processed in scientific and technical computing is also in the form of strings, the utilities for string manipulation are of foremost importance.

That such an elementary consideration got voted over yet again at Fortran meeting #219 places Fortran at such a disadvantage. The question remains: For Whom Fortran?

jacobwilliams commented 4 years ago

Wow, these SPLIT proposals are really... um... requiring of more time in the oven... So, it's going to return the tokens with a bunch of extra padding spaces? All because we can't have a decent string class? That is just terrible. And to get even this is requiring years of debate? Is this really the best we can hope for?

How can we liberate Fortran from the Fortran committee?

sblionel commented 4 years ago

You could join the committee, or at least submit some proposals that flesh out your ideas.

That SPLIT proposal was the work of a day or two, and went through some rather major changes. A big part of the problem is the inability to have an array of different-length strings, but one can deal with that by trimming. To do more would require a major feature we're not going to do this time around.

certik commented 4 years ago

> And to get even this is requiring years of debate? Is this really the best we can hope for?
>
> How can we liberate Fortran from the Fortran committee?

@jacobwilliams When you write criticism, please follow the Code of Conduct. Trust me, everybody on the committee, and especially @sblionel, has heard such sentiment before, and you will not achieve any change just by repeating it; the only outcome will be that the committee members will not want to participate here.

That being said, I agree with what you are (I think) trying to say. But the only way to improve things is to join the committee --- I asked you a few times and my invitation stands: please consider joining the committee, we need help. If you cannot join the committee, then the best way you can help is to constructively discuss things here and help draft proposals.

For this process to work, the committee must eventually make a few changes to how it operates. One is to open up the discussion process, and I think we have been very successful with this GitHub repository; the committee, as far as I can tell, is supportive of this effort. That's just the technical part, but it's a huge improvement. I know that you want a lot more changes and I do too, but again, we can sit and complain, or we can get to work on improving this process.

The next step is to make the committee's work transparent in how it decides which proposals get considered and to give feedback why a proposal was not accepted. I am working on this too, see #98.

See #97 for a discussion about standardizing "simple" things.

jacobwilliams commented 4 years ago

@certik Sorry, no offense was intended. My somewhat tongue-in-cheek comment wasn't intended to be a personal attack on anybody. I was only referring to the ISO process. I agree 100% with what you are saying, and applaud your efforts here.

jacobwilliams commented 4 years ago

Back to original topic. Here's one I use all the time:

1. string_replace
2. Replace all occurrences of one string with another
3. Yes, Python has this:
```
>>> 'aaaa'.replace('a','bb')
'bbbbbbbb'
```

Something along these lines:

character(len=:),allocatable :: str
str = 'aaaa'
call string_replace(str,'a','bb')
write(*,*) str ! writes bbbbbbbb
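As a rough sketch of how such a routine might look with deferred-length strings (an illustration, not the code referred to above): the module name `replace_sketch` is invented, and the behaviour for an unallocated string or an empty search string, where the routine simply returns, is an assumption. The scan always runs over the original text, so replacement text is never re-examined for further matches.

```fortran
module replace_sketch
  implicit none
  private
  public :: string_replace

contains

  ! Replace every non-overlapping occurrence of old in str with new.
  ! str is reallocated to the resulting length.
  subroutine string_replace(str, old, new)
    character(len=:), allocatable, intent(inout) :: str
    character(len=*),              intent(in)    :: old, new
    character(len=:), allocatable :: res
    integer :: pos, start

    ! Assumed behaviour: nothing to do for these edge cases.
    if (.not. allocated(str)) return
    if (len(old) == 0) return

    res = ''
    start = 1
    do
      pos = index(str(start:), old)
      if (pos == 0) exit
      ! Copy the text before the match, then the replacement.
      res = res // str(start:start + pos - 2) // new
      start = start + pos - 1 + len(old)
    end do

    ! Append whatever follows the last match.
    str = res // str(start:)
  end subroutine string_replace

end module replace_sketch
```

Building the result in a separate buffer keeps the loop simple, at the cost of repeated reallocation for strings with many matches.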

certik commented 4 years ago

@jacobwilliams no worries, thank you for contributing here!

FortranFan commented 4 years ago

@certik wrote:

.. That being said, I agree with what you are (I think) trying to say. But the only way to improve things is to join the committee --- I asked you a few times and my invitation stands: please consider joining the committee, we need help. If you cannot join the committee, then the best way you can help is to constructively discuss things here and help draft proposals. ..

@certik and everyone interested in advancing Fortran:

Please see a similar discussion thread on the unmoderated platform of comp.lang.fortran via this link and this one.

With respect to the point about joining the committee, please first consider the following:

1. There appears to be a crucial difference between joining a working committee of a national body, like the so-called "J3 committee" for the US, and joining the international one, WG5, given all the decision-making control and influence that WG5 effectively holds over the Fortran standard.
2. Readers need to keep in mind that the J3 committee, as a US national body, can have >7 billion members from all over the world, practically all of humanity, and develop fully worked out proposals for all the features that one can ever contemplate for a programming language (Namespaces; Generics; Exception Handling; Standard Containers for strings, dictionaries, trees, etc.; Object-Oriented enhancements, Type-safe enums, and on and on).
3. J3 can then present all those proposals at a WG5 meeting, which presently comprises 3 national bodies only: US, UK, and Japan. But if the other two national bodies - usually one representative each - happen to think all these features are not of any interest to them for whatever reason (their focus on strict numerical performance only, compiler vendor reluctance, bad mood, luddite, etc.), each and every one of those proposals can get rejected (or deferred to an unspecified future Fortran revision in year 20YZ) by a vote of 2-1, which can effectively mean the vote of 2 people against the rest of humanity.
4. The bulk of feature development occurs at physical 5-day meetings held by the J3 committee, typically in Las Vegas NV, on working days a few times a year (usually two). Re: this arrangement, consider the constraints expressed by @rweed in that comp.lang.fortran thread, "having to use our own personal time/money or beg our employers to support a trip to Las Vegas or elsewhere. If I told my boss I wanted him to pay for a trip to Las Vegas to participate in a Fortran standard committee meeting he would kick me out of his office."

Thus when readers such as @jacobwilliams are invited to join the committee and they accept the invitation, that can be remarkably beneficial to the quality and quantity of proposals by J3 but that might come about at possibly great personal cost (time and/or finances and/or relationships given the Vegas location) to these new members.

But all that effort can yet FAIL ENTIRELY to bear any fruit and can potentially make NO difference to the state of the Fortran language, constrained as it is by the ISO/IEC standard, whose content as well as pace of change is completely controlled by WG5, as explained in point 3 above.

It can literally take DECADES and DECADES for the Fortran standard to have the simplest of features such as a string utility like SPLIT or a type-safe ENUM, facilities that pop up cleanly and efficiently in other languages such as Python (look at all their enhancement proposals), C++, C#, Julia, etc. in a matter of months or a year.

So how much time and effort can Fortranners afford to spend on developing this language? How long can the practitioners wait for the language to get the features they need in their coding?

In effect, the arrangement with Fortran and WG5 is not all that different - philosophically speaking - from trying to join the UN to achieve world peace.

Considering all this, the statement "How can we liberate Fortran from the Fortran committee?" by @jacobwilliams in https://github.com/j3-fortran/fortran_proposals/issues/96#issuecomment-557806325 should really resonate with every persevering practitioner of Fortran.

certik commented 4 years ago

@FortranFan what you and @jacobwilliams said resonates strongly with me also. All I am saying is that we need to work as a team, and together with the Fortran committee (both J3 and WG5). We need to discuss these things without alienating anyone and then come up with a constructive solution. As I said, there are many many steps towards fixing these. First is to even have a discussion (fixed by this GitHub repo). Second is to have rules and a streamlined process in the J3 committee (proposed fix in #98). Third is to have a discussion in WG5 --- I haven't been to any meeting yet, so I don't know yet what (if anything) needs to be improved there. Then the whole process must be streamlined, hopefully sped up (#36), etc. I don't have any silver bullet, I don't think there is one.

sblionel commented 4 years ago

Wow - where to begin. There is quite a bit of @FortranFan 's post that is incorrect.

J3's membership is open to anyone, but when J3 votes, only the "principal member", or a single alternate if the principal is not present, can vote in plenary session. That said, most of the direction is taken from "straw votes" where anyone present can participate.

WG5 has many member countries, though it is true that only a few tend to be represented at the annual meeting. In addition to those mentioned, Germany and Canada are usually represented. It is NOT true that votes to accept or deny features are cast by single individuals. Country votes happen only on letter ballots, which happen towards the end of the process. Otherwise, all organizations represented at WG5 meetings have an equal vote, and by far the most of these are US-based.

It does not take "DECADES and DECADES" to add features, but neither does this happen overnight. My main goal as WG5 Convenor (think of it as a chairperson role) is to get the next revision out within five years, and hopefully less than that, with another revision five years after that. There has been tremendous pressure on the committee to slow down adding features so that compilers can catch up.

I'm all in favor of anything that will let us develop the language faster, but this doesn't mean rushing into designs that may not work well with the rest of the language.

certik commented 4 years ago

> but this doesn't mean rushing into designs that may not work well with the rest of the language.

I just want to point out that I think we all agree on that. It's about figuring out how to achieve a design that works well with the rest of the language (in a timely manner).

gronki commented 4 years ago

I understand your point, and as chair of the organization you know many things that we in the community are not aware of. To me, however, this dynamic of compiler vendors vs. the committee is extremely weird. It's almost as if Fortran caters to their needs more than to the actual development of the language. And their needs are "our customers want to run outdated f77 codes so god forbid we remove any obsolete features!!!", which is the cause of the stall, the lack of any progress, and the invalid "backwards compatibility" arguments. On the other hand, free compilers (which are the only ones that matter) like gfortran or lfortran thrive, which makes me think that maybe the lobby of non-free compiler vendors is not the best voice to listen to in the decision making.

Please keep in mind that I am not involved in the committee's work, so this is only my biased perspective as an outside observer. Please feel free to correct me if I am wrong.

Dominik


FortranFan commented 4 years ago

@sblionel wrote:

Wow - where to begin. There is quite a bit of @FortranFan 's post that is incorrect. ..

I have yet to notice any inaccuracy in my post, and when I do, I shall be the first to admit it.

Just as "shall" carries particular meaning in the Fortran standard, my post relies on the verb "can", which is "used to indicate possibility". That possibility is more than backed up by considerable evidence, given how features keep getting dropped time and again from Fortran revisions, such as the BITS data type from the Fortran 2008 draft and, repeatedly, the topic of this thread, string utilities.

> .. It does not take "DECADES and DECADES" to add features, ..

No, there is considerable evidence many features do take that long, or that matters can be worse, in that the standard may never see some features e.g., an intrinsic string type.

Consider this paper by the "ISO Meeting of Fortran Experts," all the way back from 1982: https://wg5-fortran.org/N001-N1100/N052.txt. Now, consider a couple of sections from it:

8.  Bit Data Type
    -------------
    This is very important for certain types of application. There has
    been inconclusive debate over whether it should be in the core
    language.

No amount of input from the practitioners of Fortran appears to "settle" this debate; why is that? Why are there continued arguments against adding a feature, and why is the rationale against doing something so often circular or non-technical (e.g., wait for compilers to catch up)? It took 10 years, a DECADE, for Fortran 2018 to be published, a minor revision to boot, and there is still a call to wait for compilers to catch up with 2008.

12. Character Data Type Extensions
    ------------------------------
    Academic computer scientists have poked fun at Fortran for many
    years and it is feared that having two such closely parallel but
    different facilities as STRING and CHARACTER will give renewed
    cause for mirth. More importantly, it will cause confusion to
    users of the language, and judging by Fortran 77, possibly also to
    implementors. Since STRING is more powerful and more general it
    alone should be in the core and CHARACTER should be relegated to
    the compatibility module.

The above two paragraphs are proof positive of the decades-long wait by the practitioners of Fortran, even though they had conveyed their needs and those needs had been recognized by the so-called "experts". These two basic features were up for discussion at this year's WG5 meeting, what a coincidence!

And what happened exactly?

BITS data type got deferred,
There is still no STRING intrinsic type,
Even as the CHARACTER type remains an intrinsic type with limitations, such as with arrays of varying lengths, and Fortran continues to be mocked and remains laughable, all but one of the utilities - the topic of this thread - get dropped from the list.

So back in 1982 it was recognized of the BITS type that "This is very important for certain types of application". 36 years later, i.e., last year, we had a project (where I work) that could really have used this data type in the code design. That's well over 3 decades later and the feature is still missing.

The same with string utilities.

> .. There has been tremendous pressure on the committee to slow down adding features so that compilers can catch up. ..

By whom, 1 or perhaps 2 national bodies? As evidenced in https://isotc.iso.org/livelink/livelink?func=ll&objId=20648817&objAction=Open and https://isotc.iso.org/livelink/livelink?func=ll&objId=20632887&objAction=Open? Or due to one specific compiler vendor exerting outsized influence on these national bodies? Regardless, this goes back to my earlier comment about how proposals can peter out at the WG5 level.

Fortran users the world over, in online forums and in their input to the WG5 survey itself, are feeding back the exact opposite; see comments such as these:

"Is it too late now to start this project? ..The list of new features to be added in Fortran 202X has been finalized in last August at the Tokyo WG5 meeting. More specifically, exception handling has been unfortunately rejected " https://groups.google.com/d/msg/comp.lang.fortran/dFenjU25o9k/Z5MMiXyRAAAJ
"Personally, it's very sad that various features are postponed or even rejected for this round of revision (which may be reasonable for the committee for various reasons), and I'm (very personally) sad that a builtin "string" type seems not considered even as a revision candidate" https://groups.google.com/d/msg/comp.lang.fortran/dFenjU25o9k/-t-OEW2aAAAJ
".. in 2019, people will use C++ instead .." https://groups.google.com/d/msg/comp.lang.fortran/dFenjU25o9k/srcYqhXYAAAJ
".. I am afraid that "waiting for most compilers to catch up" could mean "forever" in practice..." https://groups.google.com/d/msg/comp.lang.fortran/dFenjU25o9k/UVwbjkraAAAJ

milancurcic commented 4 years ago

@gronki I get it and I don't think you're wrong, and yes, you're biased just like me and everybody else here with specific perspective and needs. Here's my biased perspective.

To me it makes perfect sense that compiler vendors follow their bottom-line (needs of customers who pay the most), and the committee is in big part made of representatives from vendors. I don't find it weird at all, and I'd find it weird if it were any other way.

I also understand your and some other people's disregard for backward compatibility, but this is subjective. For instance, I'm an application developer and member of the community, and I care about backward compatibility and consider it one of Fortran's great strengths. I also think backward compatibility is a red herring being considered an obstacle to progress. It's quite possible to advance the language while preserving backward compatibility. What I see as real obstacles are the disconnect from the community and outdated, slow processes.

I also think it's subjective to consider only free compilers to matter. For me, both free and commercial compilers are essential. The former mostly for development, the latter for production.

So what do we do? Can we try to work with the committee and help them make adjustments? If you care about sticking with and advancing Fortran, commercial vendors, committees and backward compatibility are part of the course. I will keep asking "for whom the Fortran Standard Committee?", but at some point you got to ask where do you want to go and how can you best get there.

certik commented 4 years ago

@FortranFan I think the only reasons I would like features to be rejected are if they are not ready, or if they really do not belong in Fortran (see #59). I am happy the exceptions got rejected, because the feature is simply not ready (it is too easy to make things worse by putting a half-thought-out feature in). The string type, on the other hand, seems like a good example of what you are talking about: it seems to take decades to get it in.

certik commented 4 years ago

@milancurcic I agree. We have to try our best to work with the committee. But it goes both ways, the committee must try its best to work with the wider Fortran community. What I have seen so far (and this GitHub repository is a proof of that) is that the wider community is eager to work with the committee, if the committee is willing to reach out.

sblionel commented 4 years ago

By the numbers - J3 currently has 15 members; only four are from vendors (Cray, IBM, Intel and Nvidia - Malcolm Cohen works for NAG but NAG dropped their membership; Malcolm is one of my alternates now). Most vendor reps are from the support teams (as I was), not development, and are close to what their customers are looking for.

sblionel commented 4 years ago

> @milancurcic I agree. We have to try our best to work with the committee. But it goes both ways, the committee must try its best to work with the wider Fortran community. What I have seen so far (and this GitHub repository is a proof of that) is that the wider community is eager to work with the committee, if the committee is willing to reach out.

And we have been doing exactly that - witness the survey we ran for more than half a year, with more than 130 detailed responses from the user community, that fed directly into the planning for Fortran 202X. This github forum is fine, but so far it's mainly a lot of arguing. If I wanted that, I'd go to a J3 meeting.... Oh, wait...

certik commented 4 years ago

@sblionel I must react to this:

> This github forum is fine, but so far it's mainly a lot of arguing.

I don't think that's accurate. If you browse the issues:

https://github.com/j3-fortran/fortran_proposals/issues

The vast majority are constructive issues (including this one at the top) of what people are requesting.

There are only a very few issues where we have off-topic discussions (like this one). The reason for that is that the community feels the committee does not have a real discussion with the wider community about features, and there is a frustration in the community about how to even submit a proposal that will be considered (see #98).

The survey you did was great, thank you for doing it, and it should be part (although not the only thing) of what the committee is doing in order to engage the wider community.

gronki commented 4 years ago

yeah my thread for string function proposals burned :(


certik commented 4 years ago

> yeah my thread for string function proposals burned :(

Yeah, I am sorry. I do believe this is only temporary, as we build trust and fix the committee processes (#98), and once people can see how the process works, there won't be a need to argue about the process.

FortranFan commented 4 years ago

@gronki wrote:

yeah my thread for string function proposals burned ..

This can be a recurring issue with such proposals because the barriers to their entry into the language are mostly non-technical, and these barriers have been around for a long time. The programming needs have long been recognized, e.g., that paper I linked above with the STRING type being mentioned back in 1982!

FortranFan commented 4 years ago

@certik wrote:

yeah my thread for string function proposals burned :(

Yeah, I am sorry. I do believe this is only temporary, as we build trust and fix the committee processes (#98), and once people can see how the process works, there won't be a need to argue about the process.

As I've mentioned before, kudos to you on a great initiative here.

For the sake of Fortran, I really do hope the rules and work process fall into place nicely allowing everyone to "build trust" which is so critical. Once you achieve that, perhaps this community can progress to a state where such an online collaboration forum can become a productive development platform also. To paraphrase from my comment in that thread at comp.lang.fortran:

adopt *more* of the modern options involving online collaboration toward at
least the aspects in language development which fall mostly in the category
of that mentioned in "What is New in Fortran 2018" document at the WG5
website i.e., "Features that address deficiencies and discrepancies".  In my
mind, what is suggested in the original post here with string utilities
falls under this bracket.

A large fraction of the scope and effort toward feature enhancements such as
these, the "minor" ones per Modern Fortran Explained, is decidedly limited.
But such a worklist can be very tedious and quite burdensome if it is
approached in the traditional manner of serialized processing by a small
subcommittee of a burgeoning number of such requests from the users.

However, offloading a lot of such feature development effort, especially the
grunt work that is otherwise constant with each Fortran standard revision,
to a more modern development model which also involves crowd-sourcing from keen
Fortranners globally via online collaboration platforms and which often garners
24x7x365 engagement from the enthusiastic Fortran community, can really help
Fortran with parallelized and semi-automated advancement.

What is mostly required is enumeration and enunciation by Fortran (sub)
committees of a basic set of language semantics (rules) and (other) requirements
and constraints which need to be kept in mind while developing features.  The
"crowd" can then iron out a lot of wrinkles in its own ideas, and even reject
a bunch of them.

The standard (sub)committee(s) would then review, refine, and redirect
development and hopefully reduce its own burden along the way.

My bottom-line message: at least with "Features that address deficiencies and
discrepancies" in Fortran, the standard body would do well to consider alternate
options to develop proposals which then allow the introduction of MORE as well
as SPEEDIER refinements in the language.  The traditional approach of having to
join committees and attend physical meetings is only possible for a select few.

There is a need to continuously improve the work processes for faster development.

certik commented 4 years ago

Let's fix 90% of this issue by implementing the string utilities into stdlib (#104) and by making stdlib a success.

zbeekman commented 4 years ago

To get back on topic:

IMO, the hardest parts of writing decent string functions are:

The awkwardness of character arrays
Unknown string kinds that must be queried at configure time. (e.g., Is "DEFAULT" "ASCII"? Do we have "ISO_10646"? Does the processor provide additional kinds for, e.g., Kanji?)
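For reference, the kind queries mentioned above can be sketched with the standard SELECTED_CHAR_KIND intrinsic, which returns -1 for a kind the processor does not support. The program name is just for illustration, and the values printed are processor-dependent, so none are shown here.

```fortran
program char_kinds
   implicit none
   ! selected_char_kind returns -1 when the processor does not support the named kind
   integer, parameter :: default_kind = selected_char_kind('DEFAULT')
   integer, parameter :: ascii_kind   = selected_char_kind('ASCII')
   integer, parameter :: ucs4_kind    = selected_char_kind('ISO_10646')

   print '(a,i0)', 'DEFAULT   kind = ', default_kind
   print '(a,i0)', 'ASCII     kind = ', ascii_kind
   print '(a,i0)', 'ISO_10646 kind = ', ucs4_kind
   print '(a,l1)', 'DEFAULT is ASCII:    ', default_kind == ascii_kind
   print '(a,l1)', 'ISO_10646 supported: ', ucs4_kind >= 0
end program char_kinds
```

A library that wants to support every available kind still has to generate one copy of each routine per kind from the result of a check like this.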

So, in my opinion, the things the standard could do to help fall into two main areas, and both are at least partially orthogonal to whether this should be in the standard or in a library:

Better generic handling of intrinsic types with different kind parameters. Maybe this is via templating (#125). Maybe this is via standardized pre-processors (#65). Maybe this is by intent(out) and function return variables being allowed to adopt the same kind as an intent(in) variable (#128). Maybe this is something different that I can't think of right now.
Better handling of character arrays. Maybe this means optional reallocation of an array of strings on assignment (yuck! I hope not.) Maybe this means a new intrinsic string type or class.

There are admittedly more workarounds for variable-length string arrays, but these are at times still hampered by bugs in compiler implementations of allocatable, scalar characters in UDTs.

SPLIT() is a good case study, as it demonstrates the awkwardness of both of the issues I highlighted above.
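To make the two issues concrete, here is a minimal sketch of a split written with today's language. The name split_fixed and its interface are hypothetical (this is not the SPLIT proposed for F202x). It handles only default-kind characters, and the caller has to guess both the field width and the number of fields in advance.

```fortran
module split_sketch
   implicit none
   private
   public :: split_fixed
contains
   ! Hypothetical helper: splits STRING on DELIM into a caller-supplied
   ! array of fixed-length, blank-padded fields (default character kind only).
   subroutine split_fixed(string, delim, fields, nfields)
      character(len=*), intent(in)  :: string, delim
      character(len=*), intent(out) :: fields(:)
      integer,          intent(out) :: nfields
      integer :: start, pos

      fields  = ''
      nfields = 0
      start   = 1
      do
         if (nfields >= size(fields)) exit      ! extra fields are silently dropped
         pos = index(string(start:), delim)     ! 0 when no delimiter remains
         nfields = nfields + 1
         if (pos == 0) then
            fields(nfields) = string(start:)    ! truncated if longer than len(fields)
            exit
         end if
         fields(nfields) = string(start:start+pos-2)
         start = start + pos - 1 + len(delim)
      end do
   end subroutine split_fixed
end module split_sketch

program demo_split
   use split_sketch, only : split_fixed
   implicit none
   character(len=16) :: arr(10)
   integer :: n, i

   call split_fixed('i am a big sheep', 'big', arr, n)
   do i = 1, n
      ! the brackets make the blank padding to 16 characters visible
      print '(a,i0,a)', 'field ', i, ': [' // arr(i) // ']'
   end do
end program demo_split
```

Every field comes back padded to 16 characters, so leading and trailing blanks that were part of the input can no longer be told apart from padding, and supporting another character kind means writing the routine again.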

certik commented 4 years ago

(@zbeekman I created a new issue #128 for your "infer precision" idea, and linked existing issues for your other ideas by editing your comment.)

urbanjost commented 4 years ago

A basic string request I see all the time, as mentioned in several topics here, is converting strings to numeric values and vice versa. Many of us have such functions, often just based on a simple internal read and write. But if you extend a few intrinsics with those routines (which I have done, if anyone is interested) you end up with very "Fortranic" functions that would fulfill a common need. How about if int(), real(), dble() take CHARACTER as well as numeric types? CHAR() does not quite extend naturally to something that goes the other way (maybe), but an additional function to convert numeric values to strings would fill out the set. Also, allowing CLASS(*) numerics where the TYPE matches one of the currently supported values (i.e., numeric types) and CHARACTER would be very useful for writing functions that take many types without a template or repeated generic implementations for various types. So you could write something like

elemental function something(value)
class(*),intent(in) :: value
real :: val
    val=real(value)
 ..
 ..
end function something

would allow you to write a function that promotes(demotes) any scalar intrinsic of numeric type or string to a REAL, for example. You could call this routine with

   a=something('100.345e2')
   a=something(10)
   a=something(300.d0)

I think this stays within and naturally extends the Fortran syntax and solves the need for a commonly needed functionality (It's "Fortranic"!). A check on range (so "something(huge(0.0d0))" gets caught) would be a nice touch.

In addition

  character(len=:),allocatable :: string
  real :: value
   string='my answer is '//value

should "just work" as well. Assuming a new function STR() existed that converted any intrinsic type to a string, this would be equivalent to "string='my answer is '//str(value)".

I see no need for STR() to be limited to a single parameter. If it allowed for (say, 9) metamorphic values, something like call proc(str("my message is ",.true.," and the value is ",100.3)...) would allow strings to be easily generated and passed as input without having to do an internal WRITE first, for example.
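A rough sketch of how such a multi-argument STR() could be emulated today, using optional CLASS(*) arguments and an internal WRITE. The module name, the helper one() and the limit of three arguments are all assumptions made to keep the example short. Only default INTEGER, REAL, double precision, LOGICAL and CHARACTER are handled.

```fortran
module str_sketch
   implicit none
   private
   public :: str
   integer, parameter :: dp = kind(0.0d0)
contains
   ! Hypothetical STR(): concatenates up to three scalar values of intrinsic type.
   function str(g1, g2, g3) result(line)
      class(*), intent(in)           :: g1
      class(*), intent(in), optional :: g2, g3
      character(len=:), allocatable  :: line

      line = one(g1)
      if (present(g2)) line = line // one(g2)
      if (present(g3)) line = line // one(g3)
   end function str

   ! Converts a single value; character input is passed through unchanged.
   function one(g) result(s)
      class(*), intent(in)          :: g
      character(len=:), allocatable :: s
      character(len=64)             :: buf

      select type (g)
      type is (character(len=*))
         s = g                      ! keep leading and trailing blanks
         return
      type is (integer)
         write (buf, '(i0)') g
      type is (real)
         write (buf, '(g0)') g      ! exact digits are processor-dependent
      type is (real(dp))
         write (buf, '(g0)') g
      type is (logical)
         write (buf, '(l1)') g
      class default
         buf = '?'                  ! unsupported type
      end select
      s = trim(adjustl(buf))
   end function one
end module str_sketch

program demo_str
   use str_sketch, only : str
   implicit none
   character(len=:), allocatable :: msg

   msg = str('my message is ', .true., ' and the value is ') // str(100)
   print '(a)', msg                 ! prints "my message is T and the value is 100"
end program demo_str
```

Each additional kind needs another TYPE IS branch, which is the kind of repetition that an intrinsic STR() would remove.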

As an example, extending the DBLE(3f) intrinsic using f2008-compliant code:

!-----------------------------------------------------------------------------------------------------------------------------------
module M_extend
   use, intrinsic :: iso_fortran_env, only : int8, int16, int32, int64
   use, intrinsic :: iso_fortran_env, only : real32, real64, real128
   implicit none
   private
   public dble                      ! extend intrinsics to accept CHARACTER values and LOGICALS
   interface dble
      module procedure anyscalar_to_double
   end interface
contains
!-----------------------------------------------------------------------------------------------------------------------------------
   pure elemental function anyscalar_to_double(valuein) result(d_out)
      use, intrinsic :: iso_fortran_env, only : error_unit !! ,input_unit,output_unit
      implicit none

!$@(#) M_anything::anyscalar_to_double(3f): convert integer or real parameter of any kind to doubleprecision

      class(*),intent(in)       :: valuein
      doubleprecision           :: d_out
      doubleprecision,parameter :: big=huge(0.0d0)
      character(len=3)          :: nanstring
      select type(valuein)
       type is (integer(kind=int8));   d_out=real(valuein,kind=real64)
       type is (integer(kind=int16));  d_out=real(valuein,kind=real64)
       type is (integer(kind=int32));  d_out=real(valuein,kind=real64)
       type is (integer(kind=int64));  d_out=real(valuein,kind=real64)
       type is (real(kind=real32));    d_out=real(valuein,kind=real64)
       type is (real(kind=real64));    d_out=real(valuein,kind=real64)
       Type is (real(kind=real128))
         if(valuein.gt.big)then
            !!write(error_unit,*)'*anyscalar_to_double* value too large ',valuein
            nanstring='NaN'
            read(nanstring,*) d_out
         else
            d_out=real(valuein,kind=real64)
         endif
       type is (logical);              d_out=merge(0.0d0,1.0d0,valuein)
       type is (character(len=*));     read(valuein,*) d_out
       class default
         !!stop '*M_anything::anyscalar_to_double: unknown type'
         nanstring='NaN'
         read(nanstring,*) d_out
      end select
   end function anyscalar_to_double
!-----------------------------------------------------------------------------------------------------------------------------------
end module M_extend
!-----------------------------------------------------------------------------------------------------------------------------------
program testit
   use M_extend
   implicit none
   ! make sure normal stuff still works
   write(*,*)'##CONVENTIONAL'
   write(*,*)'INTEGER         ', dble(10)
   write(*,*)'INTEGER ARRAY   ', dble([10,20])
   write(*,*)'REAL            ', dble(10.20)
   write(*,*)'DOUBLEPRECISION ', dble(100.20d0)
   ! extensions
   write(*,*)'##EXTENSIONS'
   write(*,*)'CHARACTER       ', dble('100.30')
   write(*,*)'CHARACTER ARRAY ', dble([character(len=10) :: '100.30','400.500'])
   ! call a function with a metamorphic argument
   write(*,*)'METAMORPHIC     ', promote(111)
   ! settle this once and for all
   write(*,*)'LOGICAL TRUE    ', dble(.true.)
   write(*,*)'LOGICAL FALSE   ', dble(.false.)
   write(*,*)'LOGICAL ARRAY   ', dble([.false., .true., .false., .true.])
contains
   function promote(value)
      class(*),intent(in) :: value
      doubleprecision     :: promote
      promote=dble(value)**2
   end function promote
end program testit
!-----------------------------------------------------------------------------------------------------------------------------------

Handles CHARACTER strings easily:

##CONVENTIONAL
INTEGER            10.000000000000000     
INTEGER ARRAY      10.000000000000000        20.000000000000000     
REAL               10.199999809265137     
DOUBLEPRECISION    100.20000000000000     
##EXTENSIONS
CHARACTER          100.30000000000000     
CHARACTER ARRAY    100.30000000000000        400.50000000000000     
METAMORPHIC        12321.000000000000     
LOGICAL TRUE       0.0000000000000000     
LOGICAL FALSE      1.0000000000000000     
LOGICAL ARRAY      1.0000000000000000        0.0000000000000000        1.0000000000000000        0.0000000000000000

FortranFan commented 4 years ago

@urbanjost wrote:

A basic string request I see all the time, as mentioned in several topics here is converting strings to numeric values and vice-versa. ..

@urbanjost, are you active on https://github.com/fortran-lang/stdlib work? Is something like "to_string" (perhaps, along the lines of C++ stdlib, or something better!) in the works there? If not, you may want to collaborate to get that added.

sblionel commented 4 years ago

A basic string request I see all the time, as mentioned in several topics here is converting strings to numeric values and vice-versa.

I always assumed that people who ask this do so because of their familiarity with such functions in other languages. Fortran's approach is quite different, though if you step back a bit it's a lot like sprintf in C. The advantage of a function is that it's easier to reference in an expression.

I could see putting a simple version in a library, though I expect it would instantly get requests for additional formatting flexibility. I suppose one could pass a FORMAT string as an optional argument.
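As a minimal sketch of that idea, assuming a hypothetical to_string() that takes the format as an optional argument (the name and interface are illustrative only and not taken from any existing library):

```fortran
module to_string_sketch
   use, intrinsic :: iso_fortran_env, only : real64
   implicit none
   private
   public :: to_string
contains
   ! Hypothetical to_string(): FMT is an optional format string, parentheses included.
   function to_string(x, fmt) result(s)
      real(real64),     intent(in)           :: x
      character(len=*), intent(in), optional :: fmt
      character(len=:), allocatable          :: s
      character(len=128) :: buf
      integer            :: ios

      if (present(fmt)) then
         write (buf, fmt, iostat=ios) x
      else
         write (buf, '(g0)', iostat=ios) x   ! default: processor-chosen width
      end if
      if (ios /= 0) buf = '*'                ! bad format, or result too long for BUF
      s = trim(adjustl(buf))
   end function to_string
end module to_string_sketch

program demo_to_string
   use, intrinsic :: iso_fortran_env, only : real64
   use to_string_sketch, only : to_string
   implicit none
   character(len=:), allocatable :: s

   s = to_string(1.5_real64, '(f8.3)')
   print '(a)', 'x = ' // s                  ! prints "x = 1.500"
   s = to_string(1.5_real64)
   print '(a)', 'x = ' // s                  ! digits shown depend on the processor
end program demo_to_string
```

The function is little more than a wrapper around an internal WRITE, which is the sprintf-like approach described above. What it adds is that the result can be used directly in an expression.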

septcolor commented 4 years ago

I always have the question, "Why doesn't Fortran introduce a decent string type?" It fails even for a simple case like merge( "true", "false", flag ). It cannot even read input from stdin while determining its length automatically (which is a piece of cake in C++ and D...). I think "character(:), allocatable" is not a solution for the future because ALLOCATABLE is not part of a type.
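For context, the usual workaround for reading a line of unknown length is a loop of non-advancing reads into a fixed-size chunk, as in the sketch below. The names read_line and chunk and the chunk size of 256 are arbitrary choices. No single READ statement does this, which is the point being made.

```fortran
module read_line_sketch
   implicit none
   private
   public :: read_line
contains
   ! Hypothetical helper: reads one complete record from UNIT into LINE.
   ! IOS is 0 on success, negative at end of file, positive on error.
   subroutine read_line(unit, line, ios)
      integer,                       intent(in)  :: unit
      character(len=:), allocatable, intent(out) :: line
      integer,                       intent(out) :: ios
      character(len=256) :: chunk
      integer            :: nread

      line = ''
      do
         read (unit, '(a)', advance='no', iostat=ios, size=nread) chunk
         if (ios > 0) return                  ! read error
         if (is_iostat_end(ios)) return       ! end of file
         line = line // chunk(:nread)         ! append what was actually read
         if (is_iostat_eor(ios)) then         ! end of record: the line is complete
            ios = 0
            return
         end if
      end do
   end subroutine read_line
end module read_line_sketch

program demo_read_line
   use, intrinsic :: iso_fortran_env, only : input_unit
   use read_line_sketch, only : read_line
   implicit none
   character(len=:), allocatable :: line
   integer :: ios

   call read_line(input_unit, line, ios)
   if (ios == 0) print '(a,i0,a)', 'read ', len(line), ' characters: ' // line
end program demo_read_line
```

The merge case fails for a related reason: both sources must have the same length, so "true" would have to be written as "true " and trimmed afterwards.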

j3-fortran / fortran_proposals

list of needed string utilities #96