ESMCI / cime

Common Infrastructure for Modeling the Earth
http://esmci.github.io/cime

You can't work from a directory that has tests in the name, it confuses CIME. #4611

Closed jgfouca closed 4 months ago

jgfouca commented 6 months ago

See https://github.com/E3SM-Project/E3SM/pull/6207#issuecomment-2038071022

xylar commented 6 months ago

I have run into this a few times when I make a work directory with either "test" or "tests" as part of a subdirectory name. I get an error something like:

$ ./create_test MVKO_PS.T62_oQU240.GMPAS-NYF -g /lcrc/group/e3sm/ac.xylar/e3sm_baselines/test_20240404 --pesfile ../../cime_config/testmods_dirs/config_pes_tests.xml --wait

Testnames: ['MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel']
Using project from config_machines.xml: e3sm
create_test will do up to 1 tasks simultaneously
create_test will use up to 160 cores simultaneously
Creating test directory /lcrc/group/e3sm/ac.xasay-davis/scratch/chrys/MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel.G.20240404_102105_tts9dt
RUNNING TESTS:
  MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel
Starting CREATE_NEWCASE for test MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel with 1 procs
Finished CREATE_NEWCASE for test MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel in 0.175126 seconds (FAIL). [COMPLETED 1 of 1]
    Case dir: /lcrc/group/e3sm/ac.xasay-davis/scratch/chrys/MVKO_PS.T62_oQU240.GMPAS-NYF.chrysalis_intel.G.20240404_102105_tts9dt
    Errors were:
        ERROR: Makes no sense to have empty read-only file: /gpfs/fs1/home/ac.xylar/e3sm_work/E3SM/mkstratos/tests/add-mpaso-mvk-test/driver-nuopc/cime_config/config_component.xml

xylar commented 6 months ago

We have also seen this a few other times in the past: https://github.com/E3SM-Project/E3SM/pull/5896#issuecomment-1703388921 https://github.com/E3SM-Project/E3SM/pull/5547#issuecomment-1483355470

xylar commented 5 months ago

@matthewhoffman just hit this issue:

$ ./create_test SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf
Testnames: ['SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf']
Using project from config_machines.xml: e3sm
create_test will do up to 1 tasks simultaneously
create_test will use up to 160 cores simultaneously
Creating test directory /lcrc/group/e3sm/ac.mhoffman/scratch/chrys/SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf.20240502_154158_9kugii
RUNNING TESTS:
  SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf
Starting CREATE_NEWCASE for test SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf with 1 procs
Finished CREATE_NEWCASE for test SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf in 0.553558 seconds (FAIL). [COMPLETED 1 of 1]
    Case dir: /lcrc/group/e3sm/ac.mhoffman/scratch/chrys/SMS_D_Ld1.T62_oEC60to30v3wLI_ais20.MPAS_LISIO_TEST.chrysalis_intel.mpaso-ocn_glcshelf.20240502_154158_9kugii
    Errors were:
        ERROR: Makes no sense to have empty read-only file: /gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-shelf-tests/driver-nuopc/cime_config/config_component.xml

xylar commented 5 months ago

It's a really common choice to have "test" or "tests" in the name of a directory you are developing tests in.

xylar commented 5 months ago

@jasonb5, I know you're a very busy guy but it would be a huge relief to have this fixed if you happen to find the time.

ekluzek commented 5 months ago

@xylar I don't think the problem is having "test" in directory names. The key error I see is this:

Errors were: ERROR: Makes no sense to have empty read-only file: /gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-shelf-tests/driver-nuopc/cime_config/config_component.xml

That says your config component file is empty. That's what you need to fix.

rljacob commented 5 months ago

"driver-nuopc" is also wrong for an E3SM case.

xylar commented 5 months ago

@ekluzek, renaming the directory to something without "test" makes the error go away every time. Having test(s) in the name has caused the issue every time. So I'm pretty convinced that's the issue.

jasonb5 commented 5 months ago

There's a fix coming in https://github.com/ESMCI/cime/pull/4621. The problem was that the model-specific customization loading was too general in how it excluded files.
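
To illustrate the kind of bug described here, the following is a minimal sketch only; it is not the actual CIME code or the change made in the linked PR. It assumes a loader that skips unit-test files by checking for the substring "tests", and contrasts a check against the whole absolute path with one limited to the path below the customization root. The function names, the `customize` directory, and the file names are all hypothetical.

```python
from pathlib import PurePosixPath


def is_excluded_too_general(path: str) -> bool:
    """Hypothetical over-broad filter: looks for "tests" anywhere in the
    absolute path, so a parent directory such as "E3SM-shelf-tests"
    causes every file underneath it to be skipped."""
    return "tests" in path


def is_excluded_scoped(path: str, root: str) -> bool:
    """Hypothetical narrower filter: only inspects directory components
    below `root`, so the name of the checkout directory is irrelevant."""
    relative = PurePosixPath(path).relative_to(root)
    # parts[:-1] drops the file name and keeps only the directories.
    return "tests" in relative.parts[:-1]


# Hypothetical layout: a checkout whose directory name contains "tests".
root = "/home/user/E3SM-shelf-tests/cime_config/customize"
module = root + "/provenance.py"
unit_test = root + "/tests/test_provenance.py"

print(is_excluded_too_general(module))       # skipped by mistake
print(is_excluded_scoped(module, root))      # kept
print(is_excluded_scoped(unit_test, root))   # skipped on purpose
```

With the over-broad check, a legitimate file is dropped purely because of where the checkout lives, which would be consistent with the symptom reported above: renaming the directory makes the error go away.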

matthewhoffman commented 4 months ago

@ekluzek and @rljacob, removing "tests" from the directory name made the problem (the error + CIME defaulting to looking for a nuopc driver) go away. I changed "tests" to "testing" and things worked as expected. @jgfouca and @jasonb5, thanks for taking care of this!
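
For anyone on a CIME version without the fix, a quick pre-flight check along these lines may help spot a problematic checkout location. This is a sketch under the assumption, based only on the reports in this thread, that the trigger is the substring "tests" in some component of the path; the helper name is hypothetical and not part of CIME.

```python
from pathlib import PurePosixPath


def suspicious_components(path: str, needle: str = "tests") -> list:
    """Return the path components that contain `needle` (hypothetical helper)."""
    return [part for part in PurePosixPath(path).parts if needle in part]


# Directory from the failing run reported above, and the renamed variant.
failing = "/gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-shelf-tests"
renamed = "/gpfs/fs1/home/ac.mhoffman/e3sm-gis/E3SM-shelf-testing"

print(suspicious_components(failing))
print(suspicious_components(renamed))
```

The first call flags `E3SM-shelf-tests`, while the renamed `E3SM-shelf-testing` directory is not flagged, matching the workaround described in this comment.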

rljacob commented 4 months ago

I believe you. That's just a really odd fail mode for such a simple thing.

matthewhoffman commented 4 months ago

Yeah, it really is (was) - I'm very glad @xylar had identified this previously and saved me a lot of time hunting.