Projeto-Pindorama / heirloom-ng

A collection of standard Unix utilities that is intended to provide maximum compatibility with traditional Unix while incorporating additional features necessary today.
http://heirloom-ng.pindorama.dob.jp

fix: remove deprecate register keywords #50

Closed callsamu closed 1 month ago

callsamu commented 2 months ago

Pull Request: Removing the "register" Keyword from the Legacy C Project

Overview

This pull request aims to remove the use of the register keyword from the legacy C codebase. The register keyword was introduced in the early days of C programming to provide a hint to the compiler that a variable should be stored in a CPU register for faster access. However, this keyword has become largely obsolete in modern C programming and its use is now discouraged.

Rationale for Removal

  1. Compiler Optimization: Modern compilers are highly sophisticated and can automatically optimize variable storage and access without the need for explicit register hints. Compilers can often make better decisions about register allocation than developers can.

  2. Portability: The register keyword is only a hint in standard C, and how much weight a compiler gives it varies across compilers and architectures, so any performance that depends on it is not portable. Removing the register keyword removes that dependency.

  3. Readability and Maintainability: The register keyword can make the code less readable and harder to maintain, as it adds an extra layer of complexity that is no longer necessary.

  4. Deprecation: The register keyword was deprecated in C++11 and removed in C++17. In C it is still valid (including C99 and C11), but it is only a hint and its use is now discouraged. Removing it aligns the codebase with modern C programming practices.

Implementation Details

  1. The register keyword has been removed from all variable declarations throughout the codebase.
  2. No other changes have been made to the functionality or behavior of the affected code.
  3. Extensive testing has been performed to ensure that the removal of the register keyword does not introduce any regressions or unexpected behavior.
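
As an illustration of item 2 above, here is a minimal sketch (hypothetical functions, not taken from heirloom-ng) of the same loop with and without the keyword. Since register is only a storage-class hint, both versions are expected to compute the same result; whether the generated machine code differs depends on the compiler and optimization level.

```c
#include <stddef.h>

/* Hypothetical example, not code from heirloom-ng. */

/* Legacy style: asks the compiler to keep the locals in registers. */
long sum_with_register(const int *v, size_t n)
{
	register long total = 0;
	register size_t i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* Same function after the removal: only the hint is gone. */
long sum_plain(const int *v, size_t n)
{
	long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}
```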

Benefits

  1. Improved code readability and maintainability.
  2. Increased portability and compatibility with modern C compilers.
  3. Alignment with current C programming best practices and standards.

Conclusion

Removing the register keyword from the legacy C codebase is a straightforward change that will improve the overall quality and sustainability of the project. By eliminating this outdated and unnecessary construct, we can make the code more readable, maintainable, and compatible with modern C development practices.

please luiz, don't be a fool.

deablofk commented 2 months ago

i agree with samu

takusuman commented 2 months ago

Honestly, I'm a little worried about how useless register actually is, since even the GNU tools, which are undoubtedly widespread both on UNIX-compatible systems and outside them, still use it today. I agree that, in many programs, putting an integer into a register isn't needed on a modern computer, but what about programs that need to do fast sums, like oawk/nawk?

I would happily merge this into the 20240220-fix branch, but first I think we should discuss a bit more where we can remove an explicit register declaration without making something slower, or whether we could add some new optimization flags to the build/mk.config file after removing these from the code.

takusuman commented 2 months ago

> Compilers can often make better decisions about register allocation than developers can.

We often speak of compiler optimization as if it were some sort of "God's code review". I would say that nobody knows how a program's code works better than the person who wrote it. Most of these programs were originally written at a time when people knew what they were dealing with; I wouldn't say that any of the register declarations there were added cluelessly, or that the compiler would make a better guess. We could certainly take this route and remove register because we trust the compiler, as the GNU Make team did in 2017 at https://github.com/mirror/make/commit/ac9721463586c85d6d2198c135b459e49088dae3, but I would like to do it in a less careless way. For example, the dc(1) manual page was corrupted because the word "register", used there in the sense of a place to store a value, was simply cut from it without any human review. Let me guess: '/register/d' was used instead of a regex that specifically matches register in C declarations? Besides that, I reiterate that I would like to add new compiler flags to compensate for the removal, and also to undo the changes to the dc(1) manual page, before merging into the 20240220-fix branch.

EDIT: Reading these would help decide which flags are best to use now: https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Optimize-Options.html https://foss-for-synopsys-dwc-arc-processors.github.io/toolchain/gcc/optimizations.html

callsamu commented 2 months ago

Concerning the method I used to remove register from the codebase, I admit that I could have done better, so I will update my removal script to match the keyword only inside a C statement. It won't be hard, though, since I have already written a statement-matching regex to handle the case where register should be replaced by int.
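
To make the "replaced by int" case concrete: old code often writes `register i;` and relies on implicit int, so deleting the keyword alone would leave an invalid declaration. The following is only a rough sketch of keyword-aware removal, written in C for illustration; it is not the actual removal script, and `strip_register` and `needs_int` are hypothetical names. It uses simple heuristics, ignores comments and string literals, and is meant for C sources only, never manual pages.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if c can be part of a C identifier. */
static int is_ident_char(int c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Heuristic: s points just past "register" and any whitespace.
 * If the next token is directly a declarator ("i;", "i, j;", "*p;"),
 * the declaration relied on implicit int and needs an explicit one.
 */
static int needs_int(const char *s)
{
	if (*s == '*')
		return 1;
	while (is_ident_char(*s))
		s++;
	while (isspace((unsigned char)*s))
		s++;
	return *s == '\0' || strchr(";,=[)", *s) != NULL;
}

/*
 * Copies src into dst (capacity cap) without the register keyword.
 * Only whole-word matches followed by whitespace are removed, so
 * identifiers such as "registers" are left alone.
 * Returns 0 on success, -1 if dst is too small.
 */
int strip_register(const char *src, char *dst, size_t cap)
{
	static const char kw[] = "register";
	const size_t kwlen = sizeof(kw) - 1;
	const char *p = src;
	size_t out = 0;

	if (cap == 0)
		return -1;
	while (*p != '\0') {
		int at_word_start = (p == src) || !is_ident_char(p[-1]);

		if (at_word_start && strncmp(p, kw, kwlen) == 0 &&
		    isspace((unsigned char)p[kwlen])) {
			const char *rest = p + kwlen;

			while (isspace((unsigned char)*rest))
				rest++;
			if (needs_int(rest)) {
				if (out + 4 >= cap)
					return -1;
				memcpy(dst + out, "int ", 4);
				out += 4;
			}
			p = rest;
			continue;
		}
		if (out + 1 >= cap)
			return -1;
		dst[out++] = *p++;
	}
	dst[out] = '\0';
	return 0;
}
```

With this sketch, `register int i;` becomes `int i;`, `register i, j;` becomes `int i, j;`, and `register char *p;` becomes `char *p;`.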

As for your viewpoint regarding compiler optimizations, the programmer can't possibly know exactly how his code will behave once compiled unless he understands what changes are brought into it during the compilation process by the language implementation, since the program executed won't be constituted of his code, but rather a translation of it by another program whose internals generally won't be known. Therefore, if you lack the knowledge of how and when these optimizations are done, it is safer to avoid premature "hacks" which will obstruct the compiler's work.

I understand the context and the purpose of the keyword inside the codebase but, as I have argued before, it's not only useless but also a sign that the project still carries a lot of legacy aspects from which it should be freed if you really wish to rework an old and abandoned coreutils implementation into a solid alternative to the current ones, one that attracts contributors. After all, few people would want to deal with files which haven't been touched in decades.

I will set aside some time to study the compilation flags matter, but I believe it won't make much of a difference.

takusuman commented 2 months ago

> As for your viewpoint regarding compiler optimizations, the programmer can't possibly know exactly how his code will behave once compiled unless he understands what changes are brought into it during the compilation process by the language implementation, since the program executed won't be constituted of his code, but rather a translation of it by another program whose internals generally won't be known. Therefore, if you lack the knowledge of how and when these optimizations are done, it is safer to avoid premature "hacks" which will obstruct the compiler's work.

That's actually a pretty good point for many of the cases there, but what about register preventing one from taking the variable's address?
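
For reference, a minimal sketch of that guarantee (a hypothetical helper, not from heirloom-ng): C forbids applying unary & to an object declared register, so the keyword also documents that nothing may alias the variable through a pointer.

```c
#include <stddef.h>

/* Hypothetical helper, not code from heirloom-ng. */
size_t count_spaces(const char *s)
{
	register size_t n = 0;

	for (; *s != '\0'; s++)
		if (*s == ' ')
			n++;

	/*
	 * Uncommenting the next line must produce a diagnostic: C forbids
	 * applying unary & to an object declared register.
	 *
	 *     size_t *alias = &n;
	 */
	return n;
}
```

Once register is removed, the commented-out line would compile, so this check is the one behavior the removal actually gives up.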

> I will set aside some time to study the compilation flags matter, but I believe it won't make much of a difference.

I will be doing it as soon as I get to my programming environment again.