NREL / ROSCO

A Reference Open Source Controller for Wind Turbines
https://rosco.readthedocs.io/en/latest/
Apache License 2.0

DISCON input file encoding #260

Open pelljam opened 1 year ago

pelljam commented 1 year ago

Hello,

I just wondered if you could clarify which encoding DISCON supports for the primary input file path? I notice that control_interface.py encodes it using UTF-8:

self.accINFILE = self.param_name.encode('utf-8')

However, when I try to use a path with non-ASCII Unicode characters, it can't find the file. This is on Windows.

Thanks,

James

dzalkind commented 1 year ago

Hi,

I might need a little more information to help you. What exactly is your use case and the issue you are seeing?

Are you trying to run with the Python control interface or something else? If possible, use quotes in your paths and try to simplify them.

Best, Dan

pelljam commented 1 year ago

Hi Dan,

Thank you for your reply.

We maintain our own wrapper/interface to the ROSCO DLL. When we pass the input file path to it, we currently encode the path as ASCII. Having seen that your interface encodes using UTF-8, I wondered whether that was supported. But when I tried encoding as UTF-8 and passing a path with non-ASCII characters to ROSCO, it failed to find the input file.

So I was just wondering if you could confirm what encoding should be used when calling the DLL?
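One plausible reading of this symptom can be sketched in Python. The sketch assumes, without confirmation from the ROSCO source, that the receiving runtime interprets the path bytes in a single-byte Windows code page such as cp1252. The helper name `as_seen_in_code_page` and the example path are hypothetical.

```python
def as_seen_in_code_page(path: str, sent_encoding: str, assumed_code_page: str) -> str:
    """Show how a path would read if bytes sent in one encoding
    were interpreted in another (an assumed code page)."""
    raw = path.encode(sent_encoding)
    # errors="replace" keeps the demo from raising on undefined byte values
    return raw.decode(assumed_code_page, errors="replace")


# Hypothetical path containing one non-ASCII character
path = "C:\\café\\DISCON.IN"

# UTF-8 encodes "é" as two bytes; a cp1252 reader sees two characters instead
print(as_seen_in_code_page(path, "utf-8", "cp1252"))

# A pure-ASCII path survives unchanged, since both encodings agree on ASCII
print(as_seen_in_code_page("C:\\cafe\\DISCON.IN", "utf-8", "cp1252"))
```

Under that assumption the DLL would look for a directory with a different name than the one on disk, which would match the "file not found" behaviour, while ASCII-only paths would keep working.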

Thanks,

James

dzalkind commented 1 year ago

Thanks for the context, James!

I can't confirm any specific encoding. That may just be what is needed for the dynamic library setup in Python. It's been a while since that was developed and no one I've asked seems to recall the specifics.

Is this discrepancy causing you issues? If not, I'm inclined to let it go and note that the DISCON.IN file path should be encoded in ASCII.
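If the ASCII-only restriction is adopted, a wrapper could reject unsupported paths before they reach the DLL rather than letting the controller fail to open the file. This is a minimal sketch; the function name `ascii_path_bytes` is hypothetical and not part of ROSCO.

```python
def ascii_path_bytes(path: str) -> bytes:
    """Return the path as ASCII bytes, or raise ValueError naming the
    characters that fall outside ASCII."""
    if not path.isascii():
        offending = sorted({ch for ch in path if ord(ch) > 127})
        raise ValueError(
            f"Input file path contains non-ASCII characters {offending!r}: {path!r}"
        )
    return path.encode("ascii")


# Hypothetical usage in a wrapper, before calling into the DLL
infile = ascii_path_bytes("C:\\models\\DISCON.IN")
```

Failing early with the offending characters listed gives the user something more actionable than a generic "file not found".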

davidheff commented 1 year ago

It feels uncomfortable to restrict users to ASCII in the modern day. This was something that happened in pre-Unicode days, but for 2023 it's really awkward for any real-world usage. Perhaps in English-speaking parts of the world it won't give anyone any trouble, but the majority of the world routinely uses characters outside ASCII.

Fundamentally, what seems to be needed is a way for the code that opens files to handle non-ASCII characters. I'm no Fortran expert. Does Fortran support Unicode file names?

davidheff commented 1 year ago

My best guess is that, at least in typical Windows Fortran environments, OPEN expects filenames to be encoded with the active code page. So probably the best that you can do is try to encode any file names that way. In Python you'd use this encoding:

f"cp{ctypes.windll.kernel32.GetACP()}"

I think! But that will only get you so far. You are still in trouble if your file name / path has characters from outside the active code page. This is of course why Unicode exists.
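The suggestion above can be sketched as follows, under the same unconfirmed guess that the Fortran runtime expects the active code page. `GetACP` is a real Win32 call; the helper names `active_code_page_encoding` and `encode_path_for_dll` are hypothetical. Strict encoding surfaces exactly the limitation mentioned: a character outside the code page raises instead of being silently replaced.

```python
import ctypes
import sys


def active_code_page_encoding() -> str:
    """Return the Python codec name for the Windows active (ANSI) code page."""
    if sys.platform != "win32":
        raise OSError("GetACP is only available on Windows")
    return f"cp{ctypes.windll.kernel32.GetACP()}"


def encode_path_for_dll(path: str, encoding: str) -> bytes:
    """Encode a path strictly; raise ValueError if any character
    cannot be represented in the given encoding."""
    try:
        return path.encode(encoding, errors="strict")
    except UnicodeEncodeError as exc:
        raise ValueError(
            f"Path {path!r} is not representable in {encoding}: {exc.reason}"
        ) from exc


# On Windows, a wrapper might do (hypothetical path):
#   infile = encode_path_for_dll("C:\\Übung\\DISCON.IN", active_code_page_encoding())
```

On a cp1252 system a path containing "Ü" would encode cleanly, whereas one containing, for example, Chinese characters would raise `ValueError`, leaving the caller to choose a fallback.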

dzalkind commented 1 year ago

Thanks for the feedback. These are valid points.

The type of the input file is determined here: https://github.com/NREL/ROSCO/blob/e3b7db779ad9e7bea5dea692e92f52b10116cf02/ROSCO/src/DISCON.F90#L49

I'm not sure we currently have the bandwidth to support this, but input is welcome from the community.

ROSCO also reads in several strings, which might require updates, too.

davidheff commented 1 year ago

It's not the format of the file itself that is the issue. It's just the handling of the file names. When DISCON is called, the host passes in the file name, which for my usage needs to be a complete absolute path and can often contain non-ASCII characters. That file name gets passed as the first argument to OPEN. So to resolve this there'd need to be a way to open a file whose name can come from an arbitrary set of characters, i.e. Unicode.

What I don't know, as I'm not a Fortran programmer, is how Unicode is typically handled for such scenarios. It would astound me though if there wasn't a clean way to do this in 2023 in Fortran, although I guess you never know with Fortran!