mjlosch / optim_m1qn3

optimisation package for MITgcm based on m1qn3 with proper reverse control
MIT License

End-of-file error when running optim.x #5

Closed DaniJonesOcean closed 3 years ago

DaniJonesOcean commented 3 years ago

Hi @mjlosch,

Thanks for providing this optimisation code. I've followed the instructions in the README to run the "tutorial_global_oce_optim" case and to create an "optim.x" executable. However, when I try to run optim.x, it returns an end-of-file error.

Specifically, when I run the command ./optim.x > opt0.txt, I get the following error:

lib-5016 : UNRECOVERABLE library error 
  An EOF or EOD has been encountered unexpectedly.

Encountered during a sequential unformatted READ from unit 20
Fortran unit 20 is connected to a sequential unformatted  file:
  "ecco_cost_MIT_CE_000.opt0000"
Aborted

Have I set something incorrectly? Is there a setting that I should check somewhere? Thanks in advance for any help/guidance you can provide.

mjlosch commented 3 years ago

Thanks for trying this code. Without further information, I can only guess that you probably need to adjust the Makefile for optim.x. I think as an MITgcm user you are used to (and spoiled by) the portability of binary files, but as the file ecco_cost_MIT_CE_000.opt0000 is not written by the mds package, this portability is harder to get. So I suggest you have a look at the Fortran options that your MITgcm-Makefile uses and try to find a similar set for the optim.x-Makefile. Things to look out for are word length, conversion to ieee-be, and similar.
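One way to check for a byte-order mismatch before touching the Makefile is to look at the record markers in the file itself. The sketch below is not part of optim_m1qn3. It assumes the common layout for Fortran sequential unformatted files, in which every record is framed by a leading and a trailing 4-byte length marker; some compilers or flags use a different marker size, so treat the result as a hint only. The function name `probe_record_markers` is made up for this example.

```python
import struct


def probe_record_markers(path, max_records=5):
    """Walk the first few records under both byte orders.

    Returns a dict mapping each byte order to the list of record lengths
    found, or to None if the markers are inconsistent under that order.
    Assumes 4-byte record markers (an assumption, not a guarantee).
    """
    with open(path, "rb") as f:
        data = f.read()

    results = {}
    for name, fmt in (("little-endian", "<i"), ("big-endian", ">i")):
        pos, lengths = 0, []
        while pos + 4 <= len(data) and len(lengths) < max_records:
            (n,) = struct.unpack_from(fmt, data, pos)
            end = pos + 4 + n
            # A valid record needs room for its payload and trailing marker.
            if n < 0 or end + 4 > len(data):
                lengths = None
                break
            (m,) = struct.unpack_from(fmt, data, end)
            # Leading and trailing markers must agree.
            if m != n:
                lengths = None
                break
            lengths.append(n)
            pos = end + 4
        results[name] = lengths
    return results
```

Calling `probe_record_markers("ecco_cost_MIT_CE_000.opt0000")` and finding that only the big-endian entry holds a list of plausible lengths would suggest that the file was written with byte swapping and that optim.x needs the matching conversion flag.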

mmazloff commented 3 years ago

Hello

I have had trouble with the line in optim_readdata.F:

 read( funit ) (nWetcGlobal(k), k=1,nr)

Replacing with

   do k = 1,nr
    read( funit ) tmp2r
    nWetcGlobal(k) = tmp2r
   enddo

fixed the issue. If it's crashing while reading the header, I would try this.

Similarly in optim_writedata I have

 CMM(
 cmm   write( funit ) (nWetcGlobal(k), k=1,nr)
 cmm   write( funit ) (nWetsGlobal(k), k=1,nr)
 cmm   write( funit ) (nWetwGlobal(k), k=1,nr)
   do k = 1,nr
    tmp2r = nWetcGlobal(k)
    write( funit ) tmp2r
   enddo
   do k = 1,nr
    tmp2r = nWetsGlobal(k)
    write( funit ) tmp2r
   enddo
   do k = 1,nr
    tmp2r = nWetwGlobal(k)
    write( funit ) tmp2r
   enddo
 CMM)

Probably this is compiler specific, but has happened to me on 2 different machines now
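The two forms are not interchangeable on disk, which is why the read and write routines have to be changed together. The Python sketch below mimics the byte layout of both variants. It assumes 4-byte record markers and 4-byte integer values purely for illustration (the actual type of tmp2r is not shown in this thread), and the helper names are made up. It does not explain why the implied-do form failed on particular compilers.

```python
import struct


def fortran_record(payload, byteorder=">"):
    """Frame payload bytes with leading and trailing 4-byte length markers."""
    marker = struct.pack(byteorder + "i", len(payload))
    return marker + payload + marker


def write_implied_do(values, byteorder=">"):
    """Layout of `write(funit) (a(k), k=1,nr)`: one record holding nr values."""
    payload = struct.pack(f"{byteorder}{len(values)}i", *values)
    return fortran_record(payload, byteorder)


def write_one_per_record(values, byteorder=">"):
    """Layout of a do-loop with one `write(funit)` per element: nr records."""
    return b"".join(
        fortran_record(struct.pack(byteorder + "i", v), byteorder)
        for v in values
    )
```

With three values, the implied-do layout is a single 20-byte record, while the loop layout is three 12-byte records. Each Fortran READ consumes one whole record, so a loop-style reader applied to an implied-do file would run out of records after the first read.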

Matt


mjlosch commented 3 years ago

@mmazloff thanks for this. What you describe should also apply to the "standard" optim code in the MITgcm repository, because these routines are just a copy. Is that so? Do you think it is worth modifying the writing and reading routines in ctrlpack/unpack as well, to avoid this kind of problem? Having said that, I believe that the entire pack/unpack code in pkg/ctrl, which is basically a "coupler" to the optim routine, should have been put into the optimization routine in the first place. That way it would have been so much easier to write the optimization routine in some high-level language like Matlab or Python, with the interface constructed just by reading the adxx files and writing the xx files with rdmds/wrmds. What we have now is a very complicated code that is difficult to port, and this issue is another symptom of this mess.
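To make the idea concrete, here is a rough sketch of what such an interface could look like in Python. It is only an illustration. It assumes the control and gradient fields are stored as headerless big-endian arrays of reals, it uses only the standard library instead of rdmds/wrmds, and it replaces m1qn3 with a plain steepest-descent step. All function names are hypothetical.

```python
import struct


def read_flat(path, dtype="f"):
    """Read a headerless big-endian file of 4-byte ('f') or 8-byte ('d') reals."""
    with open(path, "rb") as f:
        raw = f.read()
    size = struct.calcsize(">" + dtype)
    n = len(raw) // size
    return list(struct.unpack(f">{n}{dtype}", raw[: n * size]))


def write_flat(path, values, dtype="f"):
    """Write values as a headerless big-endian array of reals."""
    with open(path, "wb") as f:
        f.write(struct.pack(f">{len(values)}{dtype}", *values))


def descent_step(xx, adxx, step):
    """One steepest-descent update; a stand-in for the real line search."""
    if len(xx) != len(adxx):
        raise ValueError("control vector and gradient differ in length")
    return [x - step * g for x, g in zip(xx, adxx)]
```

A driver would read the current control vector and its gradient with `read_flat`, call `descent_step`, and write the result with `write_flat` for the next model run. A real replacement would also have to handle the .meta files, masks, and scaling that the pack/unpack code takes care of.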

mmazloff commented 3 years ago

Hi Martin

It has been many years since I used the “standard code”, but I don’t remember this happening. I suspect it would.

I agree that it would be nice if pack/linesearch/unpack was all ported to matlab or python!

Matt


ifenty commented 3 years ago

I agree that it would be nice if pack/linesearch/unpack was all ported to matlab or python!

This x1000, but… Python.

-Ian


mjlosch commented 3 years ago

@mmazloff do you think that your code version is more universal than what's there now? Does it make sense to add your code here, if only in the form of a comment, or within ifdefs? If so, would you be willing to create a PR?

mmazloff commented 3 years ago

I’m no fortran expert, but it has proven less problematic for me. I can submit a PR.

Matt


DaniJonesOcean commented 3 years ago

Very good to see you all on this thread. Since @mmazloff is comfortable enough to admit that he's not a fortran expert, it's easier for me to admit that yes, @mjlosch, I have certainly been spoiled by MITgcm! I'm not a fortran expert either.

I am using the "archer" build options file with the cray compiler. Here is the build options file: https://github.com/MITgcm/MITgcm/blob/master/tools/build_options/linux_ia64_cray_archer

Based on your comments @mjlosch, I'm guessing that the most relevant line in the MITgcm cray build options file is: DEFINES='-DWORDLENGTH=4 -D_BYTESWAPIO -DHAVE_LAPACK'. Thanks in advance for your patience with a very basic question, but what would this look like in the optim.x-Makefile? What is it written in? The MITgcm file is a bash script.

I'm guessing that this is the line of the optim.x-Makefile that needs to be changed: FFLAGS = -h byteswapio -hnoomp -O0 -hfp0. Could you help me with the next couple of steps? Much appreciated as always.

mjlosch commented 3 years ago

It's important that the compiler you use to compile optim_m1qn3 is the same as the one you used to compile MITgcm, and that it uses the same basic flags (optimisation flags shouldn't matter). So in this case -h byteswapio in FFLAGS is important, but -O0 -hfp0 is not. I'll make a note of that in the instructions.

DaniJonesOcean commented 3 years ago

Hi @mjlosch. After setting the optim_m1qn3 compiler flags to basically match my MITgcm compiler flags as you suggested, I was able to get optim_m1qn3 compiled and running. Thanks for your help!

I'll close this for now. Let me know if you'd like me to share my Makefile options for the benefit of other users. We could share it here in your repository, or I could instead share the Makefile options on the ARCHER2 MITgcm documentation.

ifenty commented 3 years ago

@DaniJonesOcean I think it would be helpful to have your example Makefile options in the repo, with links in @mjlosch's readme.md