j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee
Config variables to provide a CLI #25

Open cmacmackin opened 4 years ago

cmacmackin commented 4 years ago

This is a somewhat unusual feature request, but Cray's new parallel language Chapel has an interesting approach to providing a command line interface. The programmer can define "config variables" and these can be set by command-line flags when executing the program (or take a default value). Parameters can also be config, in which case they can be specified to the compiler, much like macro definitions to the preprocessor. I thought this was a really clever, simple way to program a CLI. In Fortran config could just be an attribute of a variable, giving a syntax along the lines of

program hello_world
  implicit none
  integer, config :: n = 5
  integer :: i
  do i = 1, n
    print*, "Hello world"
  end do
end program hello_world

Then, at execution,

./hello_world
# Hello world
# Hello world
# Hello world
# Hello world
# Hello world

./hello_world -n=2
# Hello world
# Hello world
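For comparison, the behaviour above can be roughly emulated in standard Fortran today. The following is only a sketch under stated assumptions: the module name `config_int_sketch` and the routine `config_int` are hypothetical, and the `-name=value` spelling is taken from the example above rather than from any specification. The routine works on an array of argument strings so that it can be exercised without a real command line.

```fortran
module config_int_sketch
  implicit none
  private
  public :: config_int   ! hypothetical helper, not part of any standard

contains

  ! Look for the first "-name=value" entry in args.
  ! val  : the parsed value, or default when no entry matches or parsing fails
  ! stat : 0 on success or when no entry matches, non-zero on a parse failure
  subroutine config_int(args, name, default, val, stat)
    character(len=*), intent(in)  :: args(:)
    character(len=*), intent(in)  :: name
    integer,          intent(in)  :: default
    integer,          intent(out) :: val
    integer,          intent(out) :: stat
    character(len=:), allocatable :: prefix
    integer :: i, tmp

    val = default
    stat = 0
    prefix = "-" // trim(name) // "="
    do i = 1, size(args)
      if (index(args(i), prefix) == 1) then
        tmp = default
        ! Internal list-directed read; iostat reports a failed conversion.
        read(args(i)(len(prefix)+1:), *, iostat=stat) tmp
        if (stat == 0) val = tmp
        return
      end if
    end do
  end subroutine config_int

end module config_int_sketch
```

In a real program the `args` array would be filled by looping over `command_argument_count()` and calling `get_command_argument`. With a `config` attribute, all of this would presumably be generated by the compiler.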

Some tricky issues to consider:

certik commented 4 years ago

@cmacmackin thanks for the issue, I think this is a great idea, I haven't thought of this before. Thanks also for being involved here, I think there are many, many excellent ideas out there, and we just need to organize ourselves to collaborate on good proposals.

Another thing to consider here is:

I want to be able to prototype features like these in LFortran; realistically I am probably 6 to 12 months away from being able to do that. And I am hoping that Flang and also hopefully GFortran could be used for prototyping, as ideally we really need two independent implementations of any major feature that goes into Fortran. Right now we are standardizing things without having a single implementation, and we need to fix that.

cmacmackin commented 4 years ago

How would this feature interact with the user managing the command line arguments? In Python there are libraries to handle command line arguments. Should Fortran have language constructs to do so? (I think that in general yes, the Fortran way is that good features eventually become part of the language itself.)

Users would not be required to utilise this feature and the intrinsics like get_command_argument would still remain part of the standard. This would allow any existing CLI libraries to continue working if users prefer them. I can see some use cases where a library might be preferable:

  • handling positional arguments
  • implementing sub-commands
  • automatically producing output for the --help flag

The first two of these would, I think, be quite difficult to incorporate directly into the language via config variables. The third one, I think, would be impossible unless documentation comments were integrated into the standard, which would be a pretty major semantic change.

certik commented 4 years ago

What happens when you handle arguments yourself, and then use this config attribute?

Should the compiler issue a warning?

On Sat, Oct 19, 2019, at 4:31 PM, Chris MacMackin wrote:

How would this feature interact with the user managing the command line arguments? In Python there are libraries to handle command line arguments. Should Fortran have language constructs to do so? (I think that in general yes, the Fortran way is that good features eventually become part of the language itself.)

Users would not be required to utilise this feature and the intrinsics like get_command_argument would still remain part of the standard. This would allow any existing CLI libraries to continue working if users prefer them. I can see some use cases where a library might be preferable:

  • handling positional arguments
  • implementing sub-commands
  • automatically produce output for the --help flag

The first two of these would, I think, be quite difficult to incorporate directly into the language via config variables. The third one, I think, would be impossible unless documentation comments were integrated into the standard, which would be a pretty major semantic change.

cmacmackin commented 4 years ago

My instinct would be that this should not be required, although there'd be nothing to stop individual compiler vendors from doing so. The config variables are just a high-level way of handling the arguments but would not prevent you from using the older, low-level approach. As an example, if you are using a Python library such as argparse, you won't get any warning if you also use sys.argv.

certik commented 4 years ago

Ah I see --- the compiler can still set your n above automatically and you can still process the command line by hand. The only issue that I can see is error handling -- what happens if the user passes a string and not a number for n, or if the command line is malformed and the user uses the config approach.

cmacmackin commented 4 years ago

I've checked the Chapel documentation and they don't actually say what happens if there is a type-mismatch. My suggestion would be that the value of the config variable is undefined in that case. Something like the following intrinsic subroutine could be added to check that read-in was successful:

subroutine check_config_status(var, status)
    class(*), intent(in) :: var
    integer, intent(out) :: status ! 0 if read-in successful, non-zero otherwise.
end subroutine

Alternatively, we could take the approach that all config variables hold their default values until the user calls some intrinsic subroutine to populate them. This could take optional status and message arguments. It would look something like this:

subroutine update_config_variables(error, message)
    integer, intent(out), optional :: error ! 0 if read-in successful
    character(len=:), allocatable, intent(out), optional :: message ! Message explaining cause of any error
end subroutine

There is a question of whether this should set the variables globally, or only in a given scope. Restricting to a given scope would provide better modularity for any library routines, as they would work without depending on update_config_variables being called elsewhere. However, the feature I propose below to deal with malformed command lines would work best if they were set globally.
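To make the deferred-population alternative more concrete, here is a rough sketch of how it might behave for a single variable. Everything in it is an assumption made for illustration: the module `deferred_config_sketch`, the routine `update_config_from` (which takes the argument strings explicitly instead of reading the command line), and the choice to leave `n` unchanged on a failed conversion. The module variable `n` stands in for `integer, config :: n = 5`.

```fortran
module deferred_config_sketch
  implicit none
  private
  public :: n, update_config_from

  ! Stands in for "integer, config :: n = 5"; keeps its default
  ! until update_config_from is called.
  integer, protected :: n = 5

contains

  ! Hypothetical analogue of update_config_variables that takes the
  ! argument strings explicitly so it can be tried without a command line.
  subroutine update_config_from(args, error, message)
    character(len=*), intent(in) :: args(:)
    integer, intent(out), optional :: error
    character(len=:), allocatable, intent(out), optional :: message
    integer :: i, ios, tmp

    if (present(error)) error = 0
    do i = 1, size(args)
      if (index(args(i), "-n=") /= 1) cycle
      tmp = n
      read(args(i)(4:), *, iostat=ios) tmp
      if (ios == 0) then
        n = tmp
      else
        ! On failure n keeps its previous value (an assumption of this sketch).
        if (present(error)) error = ios
        if (present(message)) then
          message = "cannot read '" // trim(args(i)(4:)) // &
                    "' as an integer for n"
        end if
      end if
    end do
  end subroutine update_config_from

end module deferred_config_sketch
```

Because `n` is a module variable here, this sketch corresponds to the "set globally" option. A scoped version would need some way to say which variables a given call is allowed to populate.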

How to deal with malformed command lines is a bit less clear, but my suggestion would be an additional intrinsic that returns all command-line arguments which could not be parsed as config variables because they don't match any config variable name. It could look something like

subroutine get_nonconfig_command_arguments(list)
    character(len=:), dimension(:), allocatable, intent(out) :: list
end subroutine

Alternatively, I suppose some new intrinsic derived type could be defined to return this information.
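As an illustration of what such a routine might return, the sketch below filters an array of argument strings against a list of known config variable names. The module `nonconfig_args_sketch`, the routine `nonconfig_arguments`, and its extra `args` and `names` inputs are hypothetical; a real intrinsic would presumably obtain both from the processor.

```fortran
module nonconfig_args_sketch
  implicit none
  private
  public :: nonconfig_arguments   ! hypothetical name

contains

  ! Return in list every entry of args that is not of the form
  ! "-name=value" for some name in names.
  subroutine nonconfig_arguments(args, names, list)
    character(len=*), intent(in) :: args(:)
    character(len=*), intent(in) :: names(:)
    character(len=:), allocatable, intent(out) :: list(:)
    logical :: matched(size(args))
    integer :: i, j

    matched = .false.
    do i = 1, size(args)
      do j = 1, size(names)
        if (index(args(i), "-" // trim(names(j)) // "=") == 1) then
          matched(i) = .true.
        end if
      end do
    end do

    ! Allocate explicitly, then copy the unmatched entries in order.
    allocate(character(len=len(args)) :: list(count(.not. matched)))
    list = pack(args, .not. matched)
  end subroutine nonconfig_arguments

end module nonconfig_args_sketch
```

Positional arguments and misspelled flags both end up in `list`, so the caller would still have to tell them apart.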

tclune commented 4 years ago

There are (at least) 2 open source projects that attempt to handle this issue.

FLAP: https://github.com/szaghi/FLAP and fArgParse: https://github.com/Goddard-Fortran-Ecosystem/fArgParse

Possibly I am a bit biased as the author of fArgParse, but I do have additional reasons to prefer it. Namely, I am trying to build an entire ecosystem around containers from gFTL. At the same time FLAP does have some functionality that I've not yet added to fArgParse.

cmacmackin commented 4 years ago

Yes, I am aware that libraries exist for this and as such config variables are really just a "nice to have" feature and not a necessity in the way that (say) generic programming is. However, I do think they provide a simpler and quicker to use approach than the existing libraries.

tclune commented 4 years ago

I agree that something built-in would have several benefits, but my intuition is that an argument parsing facility itself is unlikely to rise particularly high in the priority list. There would likely be more support for the "config" feature, as it would support many additional use cases.

Currently the thinking is that improved generics will simplify the development of containers in general. And the "config" described here is merely a special case of a container. There are various approaches to config implementation, and the committee may well be shy of committing to a particular approach. E.g., one could have everything be strings and then require users to cast to other intrinsic types, or one could do something like YAML where certain "obvious" items are internally converted to logical, integer, and real.
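To illustrate the second approach mentioned above (converting "obvious" items automatically), here is a small sketch of the kind of inference a Config container might perform on a raw string. The module `infer_kind_sketch`, the function `infer_kind`, and the specific rules (lower-case `true`/`false` only, character-set checks before attempting a read) are assumptions made for illustration, not YAML's actual rules.

```fortran
module infer_kind_sketch
  implicit none
  private
  public :: infer_kind   ! hypothetical name

contains

  ! Classify text as "logical", "integer", "real" or "string".
  function infer_kind(text) result(kind_name)
    character(len=*), intent(in)  :: text
    character(len=:), allocatable :: kind_name
    character(len=:), allocatable :: s
    integer :: ios, ival
    real    :: rval

    s = trim(adjustl(text))
    kind_name = "string"          ! fallback when nothing else fits
    if (len(s) == 0) return

    if (s == "true" .or. s == "false") then
      kind_name = "logical"
    else if (verify(s, "+-0123456789") == 0) then
      ! Only sign and digit characters: try an integer conversion.
      read(s, *, iostat=ios) ival
      if (ios == 0) kind_name = "integer"
    else if (verify(s, "+-.0123456789eEdD") == 0) then
      ! Characters that can appear in a real literal: try a real conversion.
      read(s, *, iostat=ios) rval
      if (ios == 0) kind_name = "real"
    end if
  end function infer_kind

end module infer_kind_sketch
```

The "everything is a string" approach would skip this step entirely and leave the conversion, and its error handling, to the user.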

If the committee does attempt to add direct support for containers, my guess would be that List and Map (à la C++ STL) would be higher priority than Config. But those 2 containers would significantly reduce the difficulty in implementing Config.

FortranFan commented 4 years ago

I agree that something built-in would have several benefits, but my intuition is that an argument parsing facility itself is unlikely to rise particularly high in the priority list. There would likely be more support for the "config" feature, as it would support many additional use cases. ..

Also, it's unclear how a "config" attribute can translate into a semantically secure and reasonably featured facility for a broad set of users. The use case shown in the original post appears insufficient for this. If that's the need, the Fortran standard tacitly suggests that practitioners follow a rather VERBOSE approach with their own "glue code", or 'use' it from libraries (which is not unlike many other aspects of Fortran), as shown below; this may be of interest to other users of Fortran reading this issue.

program hello_world

   implicit none

   integer :: n
   integer :: i, istat

   call get_n(n, istat)
   if ( istat /= 0 ) then
      print *, "Unable to fetch the value of command argument n"
      stop
   end if

   do i = 1, n
      print*, "Hello world"
   end do

contains

   subroutine get_n( nval, stat )

      integer, intent(out)   :: nval
      integer, intent(out)   :: stat

      integer, parameter :: n_default = 5
      character(len=256) :: cmd
      integer :: narg, lenc

      nval = n_default ; stat = 0

      narg = command_argument_count()
      if ( narg >= 1 ) then
         call get_command_argument(1, cmd, lenc, stat)
         if ( stat == 0 ) then
            if ( lenc >= 4 ) then
               ! N.B. assuming single-digit value of n
               if ( cmd(1:3) == "-n=" ) read( cmd(4:4), "(i1)", iostat=stat ) nval
            end if
         end if
      end if

      return

   end subroutine

end program hello_world