NOAA-EMC / NCEPLIBS-bufr

The NCEPLIBS-bufr library contains routines and utilites for working with the WMO BUFR format.
Other
46 stars 21 forks source link

Use F90 interfaces and kinds to handle 4/8/d build issues, and only build one copy of the library #78

Closed edwardhartnett closed 1 year ago

edwardhartnett commented 3 years ago

This will come after #77

Currently we build 3 flavors of the library, 4, 8, and D. Instead, use fortran interfaces and kinds to provide one library that handles all three cases.

Not sure when this should happen, we want to try this out on a simpler project first.

rmclaren commented 2 years ago

What does ufbint_body look like? What branch are you working from (is your code checked in?)

rmclaren commented 2 years ago

What happens when you do this?

subroutine ufbint_c(bufr_unit, c_data, dim_1, dim_2, iret, table_b_mnemonic) bind(C, name='ufbint_f')
  integer(c_int), value, intent(in) :: bufr_unit
  type(c_ptr), intent(inout) ::  c_data
  integer(c_int), value, intent(in) :: dim_1, dim_2
  integer(c_int), intent(out) :: iret
  character(kind=c_char, len=1), intent(in) :: table_b_mnemonic
  real(c_double), pointer :: f_data

  call c_f_pointer(c_data, f_data)
  call ufbint_body(bufr_unit, f_data, dim_1, dim_2, iret, c_f_string(table_b_mnemonic))
end subroutine ufbint_c
jbathegit commented 2 years ago

@rmclaren I've been working in the onebuild branch, and I'd just pushed up a commit containing my latest code (see https://github.com/NOAA-EMC/NCEPLIBS-bufr/tree/onebuild) when I saw your two most-recent posts.

I can try calling ufbint_body directly, but the idea was to encapsulate that as private within the module and only expose the ufbint reference as public, which is why I was trying to go through that interface. That's the part I couldn't get working, so in my latest attempt I was just trying to see if I could call ufbint_isoc directly, rather than going through the ufbint interface. My thinking was that, if we could at least get that part working, then I could go back and try to call the ufbint interface directly from within ufbint_c.

rmclaren commented 2 years ago

my suggestion above should be equivalent to calling you ufbint_d (I didn't check if thats what you called it). So you could just call that overload directly instean of ufbint_body.

jbathegit commented 2 years ago

Thanks for the suggestions @rmclaren, but no matter whether I call either ufbint_body or ufbint_4_d directly from ufbint_c, I'm still getting the same compiler error as above. It's just that now the error is occurring during the compilation of bufr_interface.f90, rather than during the compilation of ufbint.f90.

rmclaren commented 2 years ago

try making the line real(c_double), pointer, intent(inout) :: usr -> real(c_double), pointer, intent(inout) :: usr(:) in ufbint_isoc

edwardhartnett commented 2 years ago

Is the goal here to make a C API around the Fortran API?

jbathegit commented 2 years ago

@rmclaren thanks, but using the real(c_double), pointer, intent(inout) :: usr(:) syntax inside of ufbint_isoc didn't help either, and in fact generated a second compiler error in addition to the previous one:

/lfs/h2/emc/obsproc/noscrub/Jeff.Ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/src/bufr_interface.f90(221): error #8284: If the actual argument is scalar, the dummy argument shall be scalar unless the actual argument is of type character or is an element of an array that is not assumed shape, pointer, or polymorphic.   [USR]
  call ufbint_isoc(bufr_unit, f_data, dim_1, dim_2, iret, c_f_string(table_b_mnemonic))
-------^
/lfs/h2/emc/obsproc/noscrub/Jeff.Ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/src/bufr_interface.f90(221): error #6634: The shape matching rules of actual arguments and dummy arguments have been violated.   [F_DATA]
  call ufbint_isoc(bufr_unit, f_data, dim_1, dim_2, iret, c_f_string(table_b_mnemonic))
------------------------------^
compilation aborted for /lfs/h2/emc/obsproc/noscrub/Jeff.Ator/NCEPLIBS-bufr-GitHub/nceplibs-bufr/src/bufr_interface.f90 (code 1)

That said, and after a lot of proverbial blood, sweat and tears, I think I found an alternate approach using ufbint_body that will work, and which I'll push up in a bit so all of you can see it. It's not a perfect solution, because it doesn't work if I try to encapsulate the source of ufbint_body inside of a module, which means I have to expose it as a global reference. But otherwise it seems to do the trick and may be the best option. Again, I'll push it up in a bit, after I've had a chance to clean it up somewhat.

jbathegit commented 2 years ago

@edwardhartnett yes, the goal here is to have a clean C API around this particular Fortran API for the ufbint subroutine. This is one of the most important subroutines in the entire library, and there are C codes which now call it directly using the UNDERSCORE paradigm, which we've agreed we want to try to get rid of and thereby modernize the code.

So as I'm working to consolidate all of the separate _4, _8, and _d builds into one, and since we're going to continue to need to be able to call this particular subroutine from C codes, I'm trying to do that as well by taking advantage of all of the work that @rmclaren already did to write clean C interfaces for a number of Fortran library routines via bufr_interface.f90, and use that as a model going forward.

jbathegit commented 2 years ago

I just pushed up another commit to the onebuild branch. It handles all of the various Fortran interfaces for ufbint through a new module (just like for writsa as noted earlier in this thread), and including for some additional interfaces when ufbint is called with just a 1-d vector array or even just a scalar as argument usr. There are numerous existing application codes which do that, and even some library test codes which do it as well, and this new code handles those fine through a single ufbint interface, no matter the native compiled size of integers and reals within the application code.

The trick with ufbint, as noted above, was in also getting this to work when called from C using the existing bufr_interface.f90 wrapper. This is what kept generating compiler errors whenever I tried to use a module interface, and the only way I could get it to work was to move the source code for ufbint_body completely outside of the module, which means it now has global scope similar to what the source code for ufbint previously had. With that change, it now works.

Please see https://github.com/NOAA-EMC/NCEPLIBS-bufr/blob/onebuild/src/ufbint.f90 and https://github.com/NOAA-EMC/NCEPLIBS-bufr/blob/onebuild/src/bufr_interface.f90 for more details, and note from earlier that we'll also be able to simplify the ufbint.f90 code even further once we we're able to start building a single version of the entire library using the _8 flags. Specifically, we'll be able to remove all of the local my variables within all of the ufbint_8 subroutine variants and just pass the call arguments of those subroutines directly into ufbint_body.

jbathegit commented 2 years ago

@jack-woollen

Since PR #217 is now closed, let's resume that discussion in this issue thread as I suggested yesterday, rather than continuing the discussion in the PR #217 thread.

Overall, I really like the approach you're taking within your new r8wraps branch, and my only suggestions relate to style rather than functionality. Specifically, and going forward, we're really trying to use Doxygen-style documentation for all docblocks. We also want to make sure new code follows the indentation within the existing code (for overall readability) and use lower/mixed-case text rather than all upper-case within comments (so it doesn't look like we're shouting all the time ;-).

Since you asked me to look at atrcpt, I figured it would be easiest to just go ahead and push up a new commit where I've made the updates described above to what you had so far for that subroutine. Please do a pull to refresh your copy of the branch and let me know if you have any questions or concerns.

Otherwise, just a few additional points:

  1. It wasn't clear to me how you envision that an 8-byte application code would toggle the value of variable IM8 to .true. I'm assuming it wouldn't just directly "USE MODA_IM8B" within the application code but would instead call some new subroutine along the lines of datelen in order to do that, like we discussed earlier within the thread for #217. But I didn't see such a subroutine in your r8wraps branch - did you not create it yet, or did I just miss it? Either way, please update the docblock for atrcpt_8 accordingly with the answer, because I didn't know what to put there ;-)
  2. I was surprised to see that you didn't have a DIMENSION or similar declaration statement for the MSGIN and MSGOT arrays within atrcpt_8. Is that not needed, and if not then why do you have it in other similar subroutines such as cnved4_8? Whatever the correct approach is here, we should be consistent ;-)
  3. On a similar note, it looked to me like you had some statements in some of your 8-byte wrapper subroutines that really don't need to be there. For example, in datebf_8, why do you need to initialize the values of MEAR, MMON, MDAY, MHOUR, and IDATE prior to calling datebf? Those values are all output values from datebf, so whatever you initialize within datebf_8 for those values is just going to get overwritten within datebf anyway. Unless I'm missing something, I think you could just eliminate those statements altogether. Ditto for IRET within copysb_8 and gettagre_8, JTAB within getabdb_8, and possibly other cases as well.
jbathegit commented 2 years ago

Hi @jack-woollen , just wanted to touch base and see where you were with this. Have you had any more time to work on this, or do you need/want some help from me to get this knocked out?

Please let me know - thanks!

jbathegit commented 2 years ago

Per our email discussion, I started working on #79 and made some progress (see the new upgrade/C-Fortran branch), but CI testing clearly showed that it will make a lot more sense to finish the work for this issue first, and then proceed with #79 once we're down to just one build of the library which is natively compiled with 4-byte integers.

So I'll pick up where you left off with this issue in the r8wraps branch, as you suggested in your email. I'll keep you posted on my progress, and please do likewise if you manage to have any time of your own to work on this in the coming weeks.

jbathegit commented 2 years ago

Hey @jack-woollen just wanted to let you know that I was able to finish the updates noted above for the 40+ routines you had initially started. I included a new setimb8.f90 routine to set the imb8 value, and I renamed moda_imb8 to modv_imb8 since it's actually a value and not one or more arrays that are being stored there.

I also tweaked how the arguments are defined and passed between the i4 and i8 wrappers, because the previous approach ended up generating error diagnostics with the GNU compiler when I was testing a stripped-down prototype on WCOSS2. We just hadn't encountered that yet in this branch, since we're still building multiple copies of the library, so we hadn't yet actually tried to test an i8 application code with an i4 build of the library. But the way these arguments are now defined and passed (within the 40+ routines that have so far been modified in this branch) will work for all of the various compilers, once we do eventually drop down to just a single i4 build of the library.

Next step will be to make corresponding updates to all of the remaining 50+ routines that need to be similarly modified, based on the list you generated earlier.

jack-woollen commented 2 years ago

Hi Jeff

I pruned the original list a little to where it now has 79 entries of which I think 43 have been converted, leaving 36 to do. Updated complete list below.

Jack

atrcpt.f closbf.f closmg.f cnved4.f copybf.f copymg.f copysb.f cpdxmm.f cpymem.f datebf.f datelen.f drfini.f dumpbf.f dxdump.f fortran_open.f90 getabdb.f getcfmng.f gettagpr.f gettagre.f hold4wlc.f invmrg.f maxout.f mesgbc.f mesgbf.f minimg.f mtinfo.f nemdefs.f nemspecs.f openbf.f openbt.f openmb.f openmg.f pkvs01.f rdmemm.f rdmems.f rdmgsb.f readerme.f readlc.f readmg.f readmm.f readns.f readsb.f rtrcpt.f setblock.f setvalnb.f sntbbe.f sntbde.f sntbfe.f stndrd.f stntbia.f ufbcnt.f ufbcpy.f ufbcup.f ufbdmp.f ufbevn.f ufbget.f ufbin3.f ufbint.f ufbinx.f ufbmem.f ufbmex.f ufbmms.f ufbmns.f ufbovr.f ufbpos.f ufbqcd.f ufbqcp.f ufbrep.f ufbrms.f ufbseq.f ufbstp.f ufbtab.f ufbtam.f ufdump.f upds3.f writcp.f writlc.f writsa.f writsb.f

On Mon, Sep 26, 2022 at 11:41 PM Jeff Ator @.***> wrote:

Hey @jack-woollen https://github.com/jack-woollen just wanted to let you know that I was able to finish the updates noted above for the 40+ routines you had initially started. I included a new setimb8.f90 routine to set the imb8 value, and I renamed moda_imb8 to modv_imb8 since it's actually a value and not one or more arrays that are being stored there.

I also tweaked how the arguments are defined and passed between the i4 and i8 wrappers, because the previous approach ended up generating error diagnostics with the GNU compiler when I was testing a stripped-down prototype on WCOSS2. We just hadn't encountered that yet in this branch, since we're still building multiple copies of the library, so we hadn't yet actually tried to test an i8 application code with an i4 build of the library. But the way these arguments are now defined and passed (within the 40+ routines that have so far been modified in this branch) will work for all of the various compilers, once we do eventually drop down to just a single i4 build of the library.

Next step will be to make corresponding updates to all of the remaining 50+ routines that need to be similarly modified, based on the list you generated earlier.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1258927756, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6LSYYYNISTLL4OJJR3WAJUFPANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

jbathegit commented 2 years ago

Some others for the list, just so we don't forget about them:

There may still be more, but these were the ones that immediately popped up in my head after reading through your initial list. Basically, the way I see it, any library subroutine or function that is directly callable from application codes, and which has at least one integer call argument (and/or integer return code) needs to have a wrapper.

jbathegit commented 2 years ago

On a related note, I don't think that any library subroutine or function which is never intended to be directly called from application codes should be on the list. Specifically, I noted

which I don't think should ever be called from outside of the library, and which I therefore don't think should need wrappers.

jbathegit commented 2 years ago

Saving a transcript of recent email discussions into this GitHub issue, for the record...

On Tue, Oct 18, 2022 at 3:49 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Hey Jack,

I agree we've probably considered enough "what ifs" to be able to move forward.  I'll continue working on this as well when I have time, but there are far fewer functions than subroutines, so I don't think it's fair to you that I only have to work on the former and not the latter.  So how about we both just work from your earlier list, and then we each just push our updates to the r8wraps branch on GitHub and use that forum to keep each other updated on where we're at?

-Jeff

On Tue, Oct 18, 2022 at 7:02 AM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Hey Jeff

That's a good one about the seg fault generated by x84 operating on a constant in the arg list. So another rule for input args is, to be "safe", always copy inputs, via x84, into a local 4byte scalar or array. I suppose this will work for a constant array in the arg list also. Probably don't see many of those. In any case I think we've looked at enough ins and outs to be fairly confident in moving forward with reworking the im8b blocks with the new strategy. I can work on this intermittently in the next month or so. One possible division of labor could be functions and subroutines. Since I already went through a lot of subroutines how about I continue with those, and you work on functions.

Jack

On Mon, Oct 17, 2022 at 5:09 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Hey Jack,

Now that I think about it some more, you're probably right that the LMSGOT assignment case would only work on a big-endian machine if it was declared locally as I*8.

You're also right that, for any input parameter, x84 could just remap the memory and x48 could just remap it back so there would be no "net" change to the input value.  However, we'd still have a problem if a calling program ever passed in a constant as that input parameter rather than a variable set to the same value, and in which case we'd end up with a segfault from trying to byte-shift protected memory.  In other words, we can't be sure a calling program will always pass in a value whose memory address we can modify.  So to be safe, I think we're always going to need to use x84 to copy any input parameter into a separate local array or scalar before making the recursive call.

On the output side, we could still pass any output parameter directly into the recursive call, but after doing so we'd need to make sure we always call x48 on any such parameter before we pass control back to the calling program, in case it's a big-endian machine.

Thanks,
-Jeff

On Fri, Oct 14, 2022 at 6:57 PM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:


On Fri, Oct 14, 2022 at 1:45 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Hey Jack,

I agree that using x84 with N=1 has no practical effect (i.e. "does nothing" ;-) on a little-endian machine like WCOSS2, but it seems to me it would if we were on a big-endian machine, because in that case the value of LO (i.e. the first byte of the low-order word) is 5, not 1.  Again, we need to make sure whatever changes we make to the library are endian-independent, and if we happened to be on a big-endian machine then there would be some byte-shifting involved.  So that's why I was trying to include that logic in my own prototype, because when you pass an I*8 integer across a subroutine interface in Fortran, what you're really passing is the address of the first byte of the underlying memory, so if that same memory is re-interpreted as I*4 on a big-endian machine, then you end up with the wrong half of the word if you don't use x84.

Absolutely right. So for endian sake we need to use x84 for all inputs to the recursive call.  And also x48 for all outputs back to the application, unless you use x84 to copy values to a separate array or scalar which won't be changed.

And that in turn brings up another possible bugaboo.  As you know, the idea behind x84 is so that we can remap an I*8 array as an I*4 array for use in the recursive call to the same subroutine inside of the IM8B wrapper.  However, if we happen to be on a big-endian machine, then x84 is actually shifting bytes, which means that we're modifying an array that was supposed to only be input to the subroutine.  In other words, if in a subroutine docblock we advertise an argument as being strictly "input", then we should never be modifying the underlying memory anywhere or for any purposes inside of that subroutine!

 If we use x84 going in and x48 going out it shouldn't be a problem. It would just change it and change it back, if the value wasn't changed by the recursive call. Alternatively, you could always use x84 to copy to a separate array (or scalar), then no need for x48 going out at all in that case, if there's no changes.

In the case of N=1 we can get around this problem by just using an assignment statement to a local variable inside of the subroutine (e.g. MY_LMSGOT = LMSGOT) and then pass that local variable (which is implicitly I*4) into the recursive subroutine call, because in that case it's an assignment by value (not by address), and so it will always work regardless of the endianness of the underlying machine.  But what if we were passing a contiguous array of I*8 integers into the subroutine as input?  In that case we'd need to remap the memory using x84, but if we do that then we're modifying an array that is strictly an input array!

Two things. One, I think the assignment only works if lmsgot is defined I*8. And two, see the response above.

The x-files give a lot of options, enough I think to handle any of the various situations presented by the subroutines modified for IM8B operation. It will certainly suffice for most, if not all, of them.

Thoughts?
-Jeff

On Fri, Oct 14, 2022 at 10:41 AM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:

As for x84 and x48 you're exactly right. X48 should be used in all cases when returning 8byte values from 4byte words. It is the equivalent of int8=int4. And x84 should be used for all 8byte word array input to 4byte arrays. In the case of scalar words no need to use x84 unless you want to copy the 8byte word into 4bytes with another name, i.e. call x84(i,j,1), as opposed to call x84(i,i,1). The latter case actually does nothing. The x-routines (x-files?) are generally able to work fine when the first 2 arguments are the same.

And for why ones are defined with -1, see directory /lfs/h2/emc/global/noscrub/Jack.Woollen/squawk/x84x48/bits. Output from running showbits (below) tells the story.

I'll start reworking subroutines on the list, asap.

Sincerely
Jack


+ ftn -g -traceback -o showbits.x showbits.f bits.f
+ showbits.x
 
 bits for int4=0
00000000000000000000000000000000 =      0
 
 bits for int8=0
0000000000000000000000000000000000000000000000000000000000000000 =      0
 
 bits for int4=-1
11111111111111111111111111111111 =     -1
 
 bits for int8=-1
1111111111111111111111111111111111111111111111111111111111111111 =     -1
 
 



On Thu, Oct 13, 2022 at 5:30 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Dear Jack,

I did look at your example, and I do appreciate you putting this together.  I can take care of setting the value of the logical "little" within CMake for the library, and I presume your plan going forward is that we would now call x48 whenever we need to shift the bytes for an I*4 to be interpreted as an I*8, and call x84 whenever we need to shift the bytes for an I*8 to be interpreted as an I*4, including for discrete (i.e. non-array) integer values?  In other words, for a single, discrete integer, we would just use N=1 as the last argument to x48 or x84, correct?

If so, then I agree we should be able to do recursive calls to the same subroutine within the wrappers, rather than having to have separate _8 subroutines.  And that's definitely a good thing, as we've already discussed.

On a more detailed note, and maybe I'm just being dense here, but why in x48 is the integer constant "ones" set to -1?  That was the one detail I couldn't quite follow.

Thanks,
-Jeff

On Wed, Oct 12, 2022 at 3:05 PM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Jeff

I don't think you looked at my example. All that is taken care of. Please look again.

Jack

On Wed, Oct 12, 2022 at 3:01 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Thanks Jack, but I don't know of any big-endian machines we can test on, and after diving down further into the weeds myself over the past couple of days, I'm not sure we can even keep the simplification we discussed earlier.  In other words, we may need to revert to our previous approach anyway, even for the cases where we're passing discrete integers.

As I'm sure you're aware, Fortran passes everything by address across subprogram interfaces, unless you explicitly specify the "value" attribute inside the subprogram, which isn't applicable here.  So even when you're just passing a single discrete integer, what you're really passing isn't the value, but the address of where the value is stored in memory.  So in the simple example of setting an I*8 integer to a value of 10 (=hex 0A) in the calling program, we have the following sequence of bytes reading left to right on a little-endian machine such as WCOSS2:

  byte1  byte2  byte3  byte4  byte5  byte6  byte7  byte8
|  0A  |  00  |  00  |  00  |  00  |  00  |  00  |  00  |

whereas on a big-endian machine we would instead have:

  byte1  byte2  byte3  byte4  byte5  byte6  byte7  byte8
|  00  |  00  |  00  |  00  |  00  |  00  |  00  |  0A  |

Now, and unless I'm missing something, when we pass either of these into a subroutine where the receiving variable is I*4, all that's happening is that the receiving variable receives the same starting address in memory and just interprets it as I*4.  So in the first case it maps to bytes 1-4 and gets the correct value of 10, but in the second case it gets the wrong value of 0.  In other words, the simplified logic we're using now only works because we happen to be testing on a little-endian machine, and it would fail if we were testing on a big-endian machine!  And the above diagram also explains the I*4 10-element array values I was seeing earlier for jdate_in, back when I was passing it an I*8 5-element array, and where on a big-endian machine we would have instead ended up with:

jdate_in(1) = 0
jdate_in(2) = 10
jdate_in(3) = 0
jdate_in(4) = 20
jdate_in(5) = 0
jdate_in(6) = 30
jdate_in(7) = 0
jdate_in(8) = 40
jdate_in(9) = 0
jdate_in(10) = 50

So bottom line, I think the only way we can really be safe is if we continue with the same approach we've been using up until now within the r8wraps branch, where we have a separate _8 subroutine (rather than recursively calling the same subroutine), and where all of the integer variables passed across a subprogram interface are explicitly declared as I*8 or I*4 and have the correct value explicitly stored in them outside of the subroutine interface (rather than relying on the subroutine to be able to correctly map the memory space or otherwise figure out what's what).  And that in turn should also take care of any endianness concerns.

If I'm missing something here, please let me know.  Again, I really appreciate all of your help with NCEPLIBS-bufr!


-Jeff


On Wed, Oct 12, 2022 at 11:07 AM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Jeff

I worked on the dumpbf problem, both the io array problem and the endian problem. Of course we need to run on a big endian platform to test that. But this approach should work there also. The directory /lfs/h2/emc/global/noscrub/Jack.Woollen/squawk/x84x48 contains the sample code for testing dumpbf. The script that runs it is called runt, main program runt.f. Check it out and see what it does.

Jack

On Fri, Oct 7, 2022 at 4:19 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Hey Jack,

If I take /lfs/h2/emc/obsproc/noscrub/jeff.ator/for_JWoollen_2/dumpbf.f and replace the line my_jdin(kk) = jdate_in((kk*2)-1) with CALL UPB( my_jdin(kk), 64, jdate_in, ibit ) and then compile and run with debugging on, I see that jdate_in has the following values inside of the IF(IM8B) block, since it's an I*8 5-element array in the calling program main.f and therefore an I*4 10-element array inside of dumpbf.f:

jdate_in(1) = 10
jdate_in(2) = 0
jdate_in(3) = 20
jdate_in(4) = 0
jdate_in(5) = 30
jdate_in(6) = 0
jdate_in(7) = 40
jdate_in(8) = 0
jdate_in(9) = 50
jdate_in(10) = 0

So I link to the libbufr_4 build of the library (since that's the single build that we're working towards, and so nbitw=32 and nbytw=4), and I call UPB with nbits = 64 and ibit = 0, 64, 128, 192, 256 for each of the kk=1,5 times through the loop, but I get back a value of 0 in each case for each of the my_jdin(kk) values each time through the loop.  I have to admit that I really struggle trying to follow all of the ISHFT, IREV and other bitwise logic deep down in the bowels of UPB and UPBB, so I'm not sure what I'm doing wrong here and would really appreciate if you could help me decipher what the problem may be.

Again, I think we need to have something that's endian-independent here rather than using straight assignment with the ((kk*2)-1) subscript notation, so thanks in advance for any help or guidance you can provide here.

Thanks,
-Jeff

On Fri, Oct 7, 2022 at 12:13 AM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Jeff
Yay, it does seem simpler. A couple things off the top of my bufr fried brain. I think if nbytw =8 it should not go into any "wrapper" block, regardless of the im8 setting. As for endian independence,  (eg nbytw=4 with im8=t) maybe we can just do byte reversal similar to what gets done to enable LE numbers to get packed properly into BE bufr bit strings.
Jack

On Thu, Oct 6, 2022 at 8:02 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Thanks Jack!  This looks great and should lead to much simpler library code in the long run.  In fact, based on what you've shown, and in cases where we're only passing discrete (i.e. non-array) integer values across subprogram interfaces, I think we can probably simplify even further by not even having any wrappers at all!  In other words, we'd only need wrappers for cases where:
  1. we need to change a value based on whether it's I*4 or I*8, such as in atrcpt.f when we need to do something like LMSGOT_4 = (LMSGOT_8 * 2) before recursively (re)calling the subprogram and passing in LMSGOT_4 instead of LMSGOT_8
  2. we need to pass an array of integers where each array member is its own discrete value, such as is the case with JDATE and JDUMP in subroutine dumpbf.f
But I think in all other subprograms we may not even need wrappers or recursive calls at all!  See my updated test program in /lfs/h2/emc/obsproc/noscrub/jeff.ator/for_JWoollen_2 for an example of what I'm talking about, and note here that there's no wrapper whatsoever needed for minimg.f since those arguments are all discrete (i.e. non-array) integers!

However, we still need one for dumpbf.f.  This is an example of the 2nd case noted above, and note that this situation is different from the case of the array of integers MSGIN we were looking at in atrcpt.f, and which was more analogous to the case of MBAY inside of the library where we're passing a contiguous BUFR message in memory across an array of integers.  In that case, we've already seen that the application program and called subroutine can each partition the array properly according to how it's defined within that particular block of code (i.e. whether it's windowed as I*8 array members or I*4 array members within that block) and return the expected values in each case.

However, in this case (for dumpbf.f), and unless I'm missing something, we still need a loop to individually copy each individual array member from I*8 arrays to I*4 arrays inside of the IF(IMB8) loop, and then back from I*4 arrays to I*8 arrays after the recursive call to DUMPBF.  The logic I have now works for both the Intel and GNU compilers; however, the ((kk*2)-1) subscript notation during the array copies is little-endian dependent, so a better approach would be to use logic that's endian-independent, e.g. replace the line my_jdin(kk) = jdate_in((kk*2)-1) with something like CALL UPB( my_jdin(kk), 64, jdate_in, ibit ), and replace the line jdump_out((kk*2)-1) = my_jdout(kk) with a similar call to UPB (or maybe UP8?)   But I couldn't quite get that to work, unless I missed something or wasn't quite calling it correctly, or else if you can think of a better way to do this?  Or maybe my brain is just fried from looking at this for most of the day? ;-)

I think this is the last case (i.e. an array of individual data values) that we need to figure out, and otherwise I think we're good to go with your new simplified approach.  Again, kudos to you for coming up with that - that's a great approach that seems to work nicely and is much simpler overall!  Please let me know if you can help come up with something endian-independent for the copy logic noted above, and note that if you do end up linking a copy of the library and using a call to UPB or something along those lines, then make sure you first include an explicit call to WRDLEN within this test version of DUMPBF, because otherwise you won't yet have nbitw, nbytw, etc. defined inside of the library! ;-)

Thanks,
-Jeff

On Wed, Oct 5, 2022 at 9:50 AM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Jeff

Thanks for the problem! I made some progress. Check out /lfs/h2/emc/global/noscrub/Jack.Woollen/squawk/rework. To run the test use atetest.

Jack

On Mon, Oct 3, 2022 at 12:45 PM Jeff Ator - NOAA Federal <jeff.ator@noaa.gov> wrote:
Sure thing Jack.  I just put together a short sample program which shows the error with the GNU compilers on WCOSS2.

See the code in /lfs/h2/emc/obsproc/noscrub/jeff.ator/for_JWoollen, which you can copy to your own area for testing.  If you run the compile script build.sh, you should see it squawk on line 17 of atrcpt.f with the following error message:

Error: Type mismatch in argument 'lmsgot_8' at (1); passed INTEGER(4) to INTEGER(8)

But if you then uncomment line 7 of atrcpt.f and re-run build.sh, it compiles cleanly, and you can then run the resulting a.out executable to see the expected output.

FWIW, the Intel compiler on WCOSS2 doesn't flag this, and it runs fine either way there.  But the library needs to work with the GNU compilers as well, so that's why I've now started using this approach in the r8wraps branch.

-Jeff

On Sat, Oct 1, 2022 at 1:35 PM Jack Woollen - NOAA Affiliate <jack.woollen@noaa.gov> wrote:
Jeff

I also tweaked how the arguments are defined and passed between the i4 and i8 wrappers, because the previous approach ended up generating error diagnostics with the GNU compiler when I was testing a stripped-down prototype on WCOSS2.

Maybe you can show me an example of the error you're talking about here. I'd like to take a look at it.

Thanks
Jack
jbathegit commented 2 years ago

@jack-woollen I copied a transcript of our recent email discussion to this GitHub issue, so we have a record of it here. And based on the outcome of that discussion, I also just updated the r8wraps branch to use X48 and X84 in all of the routines for which we've already developed I*8 wrappers. We've still got many more to go, but we're making progress!

Only detail I wanted to point out to you was in drfini.f. In that routine, we need to call X48 for both an array (MDRF) and for two scalars (LUNIT and NDRF), and the Gnu compiler squawks if you have different ranks for any argument between different calls to the same routine from within the same program block. So I just declared each of those two scalars as single-member arrays within drfini.f, and that resolved it. Take a look if you like and let me know if you have any questions.

jack-woollen commented 2 years ago

Jeff

My only question is does it work?

Jack

On Mon, Oct 24, 2022 at 2:17 PM Jeff Ator @.***> wrote:

@jack-woollen https://github.com/jack-woollen I copied a transcript of our recent email discussion to this GitHub issue, so we have a record of it here. And based on the outcome of that discussion, I also just updated the r8wraps branch to use X48 and X84 in all of the routines for which we've already developed I*8 wrappers. We've still got many more to go, but we're making progress!

Only detail I wanted to point out to you was in drfini.f. In that routine, we need to call X48 for both an array (MDRF) and for two scalars (LUNIT and NDRF), and the Gnu compiler squawks if you have different ranks for any argument between different calls to the same routine from within the same program block. So I just declared each of those two scalars as single-member arrays within drfini.f, and that resolved it. Take a look if you like and let me know if you have any questions.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1289412722, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6LSRZARAUKXY22FKYTWE3HERANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

jbathegit commented 2 years ago

Jeff My only question is does it work? Jack

It did when I ran the CI tests for both Intel and Gnu. The drfini routine is called from within two of the CI test programs (test_OUT_4.f and test_OUT_6.f) in the test subdirectory of the library bundle, and they both passed.

jack-woollen commented 2 years ago

👍👍

On Mon, Oct 24, 2022 at 2:31 PM Jeff Ator @.***> wrote:

Jeff My only question is does it work? Jack

It did when I ran the CI tests for both Intel and Gnu. The drfini routine is called from within two of the CI test programs (test_OUT_4.f and test_OUT_6.f) in the test subdirectory of the library bundle, and they both passed.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1289429909, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6IEB6V3CADPUEYFVDLWE3IZPANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

jbathegit commented 2 years ago

Hey @jack-woollen, maybe I'm missing something obvious, but from a user perspective what's the difference between copysb and ufbcpy? I see the former calls the latter for compressed subsets, but for documentation purposes I'd like to better understand why a user would call one vs. the other in the first place?

On a related note, I'd also like to better understand how ufbcup fits into the mix, and under what circumstances it would make sense for a user to call ufbcup instead of ufbcpy or copysb.

(Thanks! ;-)

jack-woollen commented 1 year ago

@Jeff Ator - NOAA Federal @.***>

Wow, you're digging into the olden days with these questions!

Re ufbcpy and copysb, ufbcpy copies the inv and val arrays from one open subset to another, whereas copysb does the equivalent of readsb and writsb.

Subroutine ufbcup was written to copy elements like ufbcpy does, except it can copy elements into an output subset that could be a superset of the input subset.

Welcome 😊

On Mon, Oct 24, 2022 at 5:36 PM Jeff Ator @.***> wrote:

Hey @jack-woollen https://github.com/jack-woollen, maybe I'm missing something obvious, but from a user perspective what's the difference between copysb and ufbcpy? I see the former calls the latter for compressed subsets, but for documentation purposes I'd like to better understand why a user would call one vs. the other in the first place?

On a related note, I'd also like to better understand how ufbcup fits into the mix, and under what circumstances it would make sense for a user to call ufbcup instead of ufbcpy or copysb.

(Thanks! ;-)

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1289648774, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6LQZ7EW35WT53DRS23WE36PFANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

edwardhartnett commented 1 year ago

I hope all these historical details make it into the appropriate doxygen documentation, so they are captured for future users and maintainers, who will be working on this code long after we have all retired. ;-)

jbathegit commented 1 year ago

@edwardhartnett I'm trying to update the Doxygen documentation as I go, which is the main reason I asked those questions ;-)

@jack-woollen as I'm looking at ufbqcd and ufbqcp I'm seeing another potential issue. Each of those has a "real" as a calling argument (QCD is output from ufbqcd, whereas QCP is input to ufbqcp). But since it's not explicitly declared as "real*4" or "real*8" in either routine, then it seems we have a similar issue like we did before for integers, and where we ended up needing to use x48 and x84 to go back and forth between 4-byte and 8-byte values in order to be able to converge everything into a single library build.

In this case, there probably aren't too many codes out there which call either routine, so do you happen to know if those are currently compiled with 4-byte reals or 8-byte reals? If so, then maybe we just explicitly declare them accordingly within both of those routines - thoughts?

jbathegit commented 1 year ago

Or, another possible option...

By my understanding, QCP and QCD really only ever hold integer values anyway, so is there any reason we couldn't just change them to integers (IQCP and IQCD?) in these routines and then use x48 and x84? These updates are all going to be part of a new 12.0.0 (major version) release anyway, so there's an implicit understanding that folks may need to make some tweaks to their application codes to work with the updated library anyway, and so we could just document these real->integer type changes in the documentation and in the user guide and release notes so that everyone's aware.

Thoughts?

jack-woollen commented 1 year ago

@Jeff Ator - NOAA Federal @.***>

AFAIK, these codes, which were specifically written for prepobs codes, are only used by them, and are all compiled with 4 byte reals. It might make more sense to leave them be.

prepobs_cqcbufr.fd/cqcbufr.f:C BUFRLIB - CLOSBF OPENBF READMG UFBQCD CLOSMG COPYMG prepobs_cqcbufr.fd/cqcbufr.f: CALL UFBQCD(NFIN,'CQCHT',CQCPC) prepobs_cqcbufr.fd/cqcbufr.f: CALL UFBQCD(NFIN,'VIRTMP',VTPPC) prepobs_cqcbufr.fd/cqcbufr.f: CALL UFBQCD(NFIN,'RADCOR',RADPC) prepobs_prepdata.fd/prepdata.f:C 'UFBQCD' TO ENCODE PGM CODE FOR PREPDATA INTO PREPBUFR; CAN prepobs_prepdata.fd/prepdata.f:C UFBINT WRITSB UFBCNT UFBQCD DATELEN prepobs_prepdata.fd/prepdata.f: CALL UFBQCD(IUNITP,'PREPRO',PCODE) prepobs_prepacqc.fd/prepacqc.f:c - OPENBF CLOSBF UFBQCD SETBMISS GETBMISS prepobs_prepacqc.fd/prepacqc.f: call ufbqcd(outlun,'NRLACQC',nrlacqc_pc) prepobs_profcqc.fd/profcqc.f:C UFBQCD WRITSB UFBCNT SETBMISS GETBMISS prepobs_profcqc.fd/profcqc.f: CALL UFBQCD(NFIN,'CQCPROF',CQCPC) syndat_syndata.fd/syndata.f:C READSB UFBQCD UFBCPY UFBINT WRITSB syndat_syndata.fd/syndata.f: CALL UFBQCD(IUNTPO,'SYNDATA',SYNPC) gblevents.fd/gblevents.f:C> BUFRLIB - UFBINT UFBQCD GETBMISS IBFMS gblevents.fd/gblevents.f: CALL UFBQCD(IUNITP,'PREVENT',PVCD) gblevents.fd/gblevents.f: CALL UFBQCD(IUNITP,'VIRTMP ',VTCD) prepobs_prevents.fd/gblevents_cdas.f:C BUFRLIB - UFBINT UFBQCD GETBMISS prepobs_prevents.fd/gblevents_cdas.f: CALL UFBQCD(IUNITP,'PREVENT',PVCD) prepobs_prevents.fd/gblevents_cdas.f: CALL UFBQCD(IUNITP,'VIRTMP ',VTCD) prepobs_cqcvad.fd/cqcvad.f:C BUFRLIB - DATELEN CLOSBF OPENBF READMG UFBQCD prepobs_cqcvad.fd/cqcvad.f: CALL UFBQCD(IUNIT,'CQCVAD ',CQCPC) prepobs_glerladj.fd/glerlmain.f90:! readsb setbmiss ufbcnt ufbcpy ufbevn ufbint ufbpos ufbqcd prepobs_glerladj.fd/glerlmain.f90: call ufbqcd(13,'VIRTMP',vtcd) prepobs_glerladj.fd/glerlmain.f90: call ufbqcd(13,'GLERL',pcode) prepobs_oiqcbufr.fd/oiqcbufr.f:C OPENBF UFBQCD UFBINT WRITSB UFBCNT prepobs_oiqcbufr.fd/oiqcbufr.f: CALL UFBQCD(LUBIN,'OIQC',QCD)

On Thu, Oct 27, 2022 at 1:33 PM Jeff Ator @.***> wrote:

@edwardhartnett https://github.com/edwardhartnett I'm trying to update the Doxygen documentation as I go, which is the main reason I asked those questions ;-)

@jack-woollen https://github.com/jack-woollen as I'm looking at ufbqcd and ufbqcp I'm seeing another potential issue. Each of those has a "real" as a calling argument (QCD is output from ufbqcd, whereas QCP is input to ufbqcp). But since it's not explicitly declared as "real4" or "real8" in either routine, then it seems we have a similar issue like we did before for integers, and where we ended up needing to use x48 and x84 to go back and forth between 4-byte and 8-byte values in order to be able to converge everything into a single library build.

In this case, there probably aren't too many codes out there which call either routine, so do you happen to know if those are currently compiled with 4-byte reals or 8-byte reals? If so, then maybe we just explicitly declare them accordingly within both of those routines - thoughts?

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1293851768, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6IKLG75FZLVWARUT63WFK4FHANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

jbathegit commented 1 year ago

Thanks Jack. I'm happy to leave them be if you're fairly confident that all of those codes are compiled using 4-byte reals.

Alternatively, did you see my other suggestion about maybe just converting QCD and QCP from reals to integers (IQCD and IQCP?) within those two routines and then using x48 and x84 like we do for other integers? Not sure why those variables may have even been declared as reals in the first place, since unless I'm missing something they really should only ever contain integer values anyway.

jack-woollen commented 1 year ago

I agree they should really be integers. Don't remember why they were made real in the first place. But since they are so specialized, I hesitate to make the prepobs folks have to make changes in all those codes just for that. Although, maybe it is a case of forcing some changes now so as not to get bitten later on.

On Thu, Oct 27, 2022 at 5:29 PM Jeff Ator @.***> wrote:

Thanks Jack. I'm happy to leave them be if you're fairly confident that all of those codes are compiled using 4-byte reals.

Alternatively, did you see my other suggestion about maybe just converting QCD and QCP from reals to integers (IQCD and IQCP?) within those two routines and then using x48 and x84 like we do for other integers? Not sure why those variables may have even been declared as reals in the first place, since unless I'm missing something they really should only ever contain integer values anyway.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-EMC/NCEPLIBS-bufr/issues/78#issuecomment-1294083771, or unsubscribe https://github.com/notifications/unsubscribe-auth/AO3XO6OT5DH2CDJIORZIFBTWFLX4DANCNFSM4TYNIUTQ . You are receiving this because you were mentioned.Message ID: @.***>

jbathegit commented 1 year ago

Hey @jack-woollen, just an FYI that I pushed another commit up to the r8wraps branch. The updates in this latest commit include starting to modify the actual test codes to link to the _4 build of the library, even when the test codes themselves are compiled using 8-byte integers. In other words, we still run three separate tests of each of the test codes compiled using _4, _8, and _d, but each of those tests now links to the _4 build of the library! So far everything's working well, but I did want to bring your attention to one minor issue that we seem to have overlooked.

Specifically, we've only been worrying about adding _8 wrappers to user-callable library routines which have integer call arguments. So, for example, we hadn't added _8 wrappers to iupbs01 or iupbs3 because they didn't have any integer call arguments which required using x84 and/or x48. However, this leads to a problem when such routines internally call other user-callable library routines which do contain such wrappers.

For example, iupbs01 internally calls gets1loc and i4dy, and iupbs3 internally calls getlens. And in those cases, if the application program previously called setim8b because it was compiled using 8-byte integers, then those internal calls to gets1loc, i4dy, and getlens end up getting made with im8b still set to .true., even though they're being internally called from within iupbs01 or iupbs3 which now themselves are only compiled as _4 as part of the library. And, in turn, this means that those internally-called routines are now themselves calling x84 and x48 when they shouldn't be. These errors quickly showed up when I began testing out some of the test code mods noted above.

So to resolve this, we need to make sure that any user-callable subroutine or function in the library contains an internal _8 wrapper, even if the routine itself doesn't have any integer call arguments. That way, we know that im8b will have been switched to .false. before any internal calls are made to any other user-callable library routines. You can see where I made this change to iupbs01 and iupbs3 as part of this commit.

jack-woollen commented 1 year ago

Testing is a wonderful thing! Sounds like a good solution. Clearly defining the 4byte and 8byte pathways through common subroutines. Kind of head spinning.