vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0

List of supported platforms #426

Open QuantamHD opened 1 year ago

QuantamHD commented 1 year ago

Is there a list of platforms that must be supported by FORM?

I'd like to help refactor and move some of the sources to modern C++17. That may not be desirable, depending on the platforms that must be supported.

May also not be helpful.

tueda commented 1 year ago

Thank you for opening an issue with a good question.

I think the "tier 1" targets are x86_64-linux systems. macOS (like other Unixes), Windows (possibly without POSIX) and 32-bit systems are, for now, more or less maintained with some limitations, though I am not sure how many people use them (see also comments in https://github.com/vermaseren/form/issues/422).

Moving to modern C++17 (or C++20) sounds, to me, very appealing. But at the same time, I am afraid that it may create a high hurdle for developers and newcomers.

C++ may be considered too "expert-friendly" for writing code. And to build C++17 code, one needs a decent version of a C++ compiler. Recently I saw a popular high-energy physics software switching to C++17 and now it requires gcc>=9 (they say gcc 8 does not work; I haven't confirmed but I guess it relates to <filesystem>); so it does not work with Ubuntu 18.04 LTS default gcc (7.4.0). On computing servers/clusters, upgrading the gcc/OS version can occasionally be tricky and/or risky. As an extreme case, I still maintain a 10-year-old CentOS6 cluster (yes, it's very ancient).

By the way, as a historical remark, the FORM source code has been written in a rather conservative way with respect to changing C versions; ANSI C (C89) was adopted over 10 years ago. Before that, it was written with some macros that work in both the K&R style and ANSI C (see these commits, https://github.com/vermaseren/form/compare/4e421de...39cbd4e, if you are curious).

Probably others have different opinions. Maybe we need some polls.

jodavies commented 1 year ago

I haven't tested for a few years, but in principle the code compiles and runs on arm and power. With the rise of arm server CPUs, one should not throw such support away intentionally...

vermaseren commented 1 year ago

The target in the programming of Form has always been to use the simplest features of the language, in such a way that the code conforms to the language standard as closely as possible (no warnings etc.). I know that newer and newer standards sometimes make things easier for the programmer, but the big question remains whether they also make Form better. The original C version, and still the parts that I program, are in a very simple version of C that allows even bad compilers to produce efficient code. In C++ there is always the danger of using constructions that, when translated to machine code, may not be as efficient as you think. They are very handy for the programmer of course and may save much human time, but the main target in Form is to avoid spending much computer time. For many applications computer time may not be that relevant, but many people run Form programs for days, or weeks, or even months. In that case efficiency is extremely important. Only in code that is not timing critical may one use less efficient algorithms/code.

The above philosophy has made Form extremely portable as well. Especially in the beginning, when there were many different computers, each with its own compiler, that was a big issue. But it still matters: when I had a 64-bit Ubuntu installed on my Raspberry Pi, the compilation of Form worked the first time around and it ran without any problems. It is a feature one would not like to lose. And on my Apple laptop with an M1 Max chip it also gave no problems. (Fastest execution time of my benchmarks I have seen till now.) There were only a few warnings in the C++ code about sprintf, which were easily removed by using snprintf (till then I did not even know it existed).

I hope this explains a bit. We make sure that it runs on current systems, but we would also like it to keep running on older ones. The 64/32-bit question is a different discussion, because at times it requires extra work.

AdrianBunk commented 1 year ago

> And to build C++17 code, one needs a decent version of a C++ compiler. Recently I saw a popular high-energy physics software switching to C++17 and now it requires gcc>=9 (they say gcc 8 does not work; I haven't confirmed but I guess it relates to <filesystem>); so it does not work with Ubuntu 18.04 LTS default gcc (7.4.0).

gcc 7 (released in May 2017) has pretty complete support for C++17: https://gcc.gnu.org/projects/cxx-status.html#cxx17

<filesystem> with gcc < 9 requires linking with -lstdc++fs, and gcc 7 has the header in <experimental/filesystem>.

If supporting older compilers is not a priority for the maintainers of this high-energy physics software, then not supporting (and testing) gcc 7 is a fair option for them and less work than dealing with such differences.
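
As an illustration of the kind of difference described above, the following sketch selects whichever filesystem header the standard library provides. It assumes a compiler that supports `__has_include`; the namespace alias `fs` and the helper `file_extension` are hypothetical names used only for this example, not part of any project mentioned here.

```cpp
#include <string>

// Use <filesystem> where it exists, else fall back to the experimental header.
#if defined(__has_include)
#  if __has_include(<filesystem>)
#    include <filesystem>
namespace fs = std::filesystem;
#  elif __has_include(<experimental/filesystem>)
#    include <experimental/filesystem>
namespace fs = std::experimental::filesystem;
#  else
#    error "no filesystem header available"
#  endif
#else
#  error "this sketch assumes a compiler that supports __has_include"
#endif

// Hypothetical helper: return the extension of a path, including the dot.
inline std::string file_extension(const std::string& path) {
    return fs::path(path).extension().string();
}
```

As noted above, with gcc < 9 the program additionally has to be linked with -lstdc++fs. Carrying this kind of conditional code and testing both branches is the extra work that a project avoids by simply requiring a newer compiler.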

QuantamHD commented 1 year ago

Thanks for the useful context.

I come from a distributed systems/HPC background, and certainly understand the limitations around older OSs and compilers, which is why I asked this question in the first place.

I also appreciate that the primary purpose of FORM is speed. I think my question is: how does the project balance that speed against maintainability, or is it even an issue?

For example, I noticed that the parser for FORM looks like a hand-written parser, possibly LALR? Could the program benefit from a parser generator instead, or is a custom parser the right move because source files might measure in TB? I'm certainly naive about some of these tradeoffs.

Moving to a modern development style could make it easier for external parties to contribute, but it's always a tradeoff. I think one area that could be improved is the debug logging code by using a more modern logging library like spdlog. It supports the same kind of ifdef debugging, but with far fewer lines of code.

Like I said, if there's something I can do that would be helpful I'm happy to contribute, but I don't want to force suggestions that are misaligned with the project goals.
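
For context, the "ifdef debugging" pattern referred to above typically looks something like the sketch below. The macro names `MY_DEBUG` and `DEBUG_LOG` and the function `sum_to` are hypothetical and are not taken from the FORM sources; this shows only the general technique using the standard library, not how FORM or spdlog implements it.

```cpp
#include <cstdio>

// Hypothetical names, not from the FORM sources. When MY_DEBUG is not
// defined, DEBUG_LOG expands to nothing, so release builds pay no cost.
#ifdef MY_DEBUG
#  define DEBUG_LOG(...) std::fprintf(stderr, __VA_ARGS__)
#else
#  define DEBUG_LOG(...) ((void)0)
#endif

// Example function with a trace statement in its inner loop.
inline long sum_to(long n) {
    long total = 0;
    for (long i = 1; i <= n; ++i) {
        total += i;
        DEBUG_LOG("i=%ld total=%ld\n", i, total);
    }
    return total;
}
```

Compiling with -DMY_DEBUG enables the trace output on stderr; without it the calls disappear at preprocessing time. A logging library would replace the hand-written macro, at the price of an additional external dependency.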

vermaseren commented 1 year ago

Of course any help is welcome, as long as it does not compromise the speed of Form and its capability to deal with gigantic expressions. Personally I am not so much up to date with the latest compiler developments. The current compiler part of Form dates from before 2000. And you do not want to see what it had before that. I am not against updating that again. I know that it is both healthy and inevitable. It is just that if you reprogram the compiler part you do not want to break people's programs unless you can also provide a more or less complete conversion program with it. I had to do that in the transition from version 2 to version 3, but even so there were people who wanted to keep using version 2, because their programs had been written by a student who had already left a few years before.

Form also uses very few external libraries. This is still a leftover from the poor quality of libraries in the past. At a certain moment I had to give that up, of course, with zlib, gmp, and now mpfr.

In the 80’s and early 90’s we had some rather nasty experiences with using libraries. The experimentalists were all using the CERN library of course. But when we wanted to test new computers to see which one we should select for our new mainframe those programs would not run, because nearly all routines in the CERN library were eventually using a few routines that were written in assembler language. The only programs that could be used were programs written by a few theorists that did not use the CERN library. That is also why I do not want assembler code in the Form sources.
