AgilentGCMS / ssim-ghg-2024

Code for SSIM-GHG 2024
GNU General Public License v3.0

Runtime error for filepaths not found? ('No suitable folder found with input data') #2

Open vwgeiser opened 2 weeks ago

vwgeiser commented 2 weeks ago

Hey all, I'm new to this repository and trying to get var4d/var4d_demo.ipynb up and running, but I'm running into an issue with file paths and the 'Paths' class.

in convert_jacobian.py line 12:

        input_folder_order = [
            os.path.join(os.environ['HOME'], 'shared/ssim-ghg-data/inversion_examples'), # on GHG Center's JupyterHub
            os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/input'), # Sourish's laptop
            ]
        output_folder_order = [ # for each input folder above, define an appropriate output folder
            os.path.join(os.environ['HOME'], 'inversion_output'), # GHG Center's JupyterHub
            os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/output'), # Sourish's laptop
            ]

I first got a KeyError because my os.environ has no 'HOME' key. It instead has os.environ['HOMEDRIVE'] (which returns 'C:') and os.environ['HOMEPATH'] (which returns '\Users\myusername').

After changing this to the correct absolute paths for my local machine:

        input_folder_order = [
            os.path.join(os.environ['HOME'], 'shared/ssim-ghg-data/inversion_examples'), # on GHG Center's JupyterHub
            os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "input"), # changed to MY laptop
            ]
        output_folder_order = [ # for each input folder above, define an appropriate output folder
            os.path.join(os.environ['HOME'], 'inversion_output'), # GHG Center's JupyterHub
            os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "output"), # changed to MY laptop
            ]

I now get the RuntimeError raised by the constructor call, even though I know the input and output file paths are correct:

vwgeiser@MACHINENAME:/mnt/c/users/vwgei/documents/csu/schuh/s/ssim-ghg-2024/ssim-ghg-data/2024/input/jacobians$ ls
jacob_bgd_021624.nc   jacob_bgd_060524.nc   trunc_full_jacob_032624_with_dimnames_unit_pulse_4x5_mask.nc
jacob_bgd_021624.rda  jacob_bgd_060524.rda

I am also a Windows Subsystem for Linux (WSL) user, so this could be what is breaking the original os.environ['HOME'] call? I am unsure.

I'm sure that once the paths point to the right place I'll be able to progress, but I wanted to bring this up.

AgilentGCMS commented 2 weeks ago

Are you running python on a windows or a linux platform?

Also, instead of changing one of the lines above, I would suggest adding a line each to input_folder_order and output_folder_order that reflect your choice of input and output folders. The code simply goes through those lists until it finds one that works.
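
For readers following along, the "go through the list until one works" behaviour can be sketched roughly as follows. This is only an illustration: the function name first_existing_folder is hypothetical, and the real Paths constructor in convert_jacobian.py may be written differently. Only the error message is taken from the traceback posted in this thread.

```python
import os

def first_existing_folder(candidates):
    """Return the first candidate that is an existing directory.

    Hypothetical helper, not the repository's actual code.
    """
    for folder in candidates:
        if os.path.isdir(folder):
            return folder
    # Same message as the RuntimeError reported in this issue
    raise RuntimeError('No suitable folder found with input data')

# Example: append your own folder to the end of the search order.
# os.getcwd() stands in for a real data folder so the example runs anywhere.
search_order = [
    os.path.join(os.path.expanduser('~'), 'shared/ssim-ghg-data/inversion_examples'),
    os.getcwd(),
]
print(first_existing_folder(search_order))
```

Note that every entry is evaluated when the list is built, so a single failing os.environ lookup raises a KeyError before any folder is checked.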

jmineau commented 2 weeks ago

@AgilentGCMS Are the jacobians and other input files being hosted somewhere we have access to? I don't believe we have received any updates. Would be really nice to have as we prepare for AGU.

xref #1

vwgeiser commented 2 weeks ago

@AgilentGCMS I was trying to run the script in Windows. When I switch to running the script in Linux, it has no problem with the os.environ['HOME'] line, but it still throws the RuntimeError even with the correct absolute paths. I don't have a Mac, which may be the platform this notebook was designed for, but I still get an error in both my Linux and Windows tests.

Oddly enough when I run it from my Linux subsystem, having the os.environ['HOMEDRIVE'] in there throws a KeyError as well.

# Code segment for Linux test

        input_folder_order = [
            os.path.join(os.environ['HOME'], 'shared/ssim-ghg-data/inversion_examples'), # on GHG Center's JupyterHub
            os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/input'), # Sourish's laptop
            os.path.join("mnt", "c", "users", "vwgei", "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "input"),
            # os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "input")
            ]
        output_folder_order = [ # for each input folder above, define an appropriate output folder
            os.path.join(os.environ['HOME'], 'inversion_output'), # GHG Center's JupyterHub
            os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/output'), # Sourish's laptop
            os.path.join("mnt", "c", "users", "vwgei", "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "output"),
            # os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "output")
            ]

andyjacobson commented 2 weeks ago

Macs are Unix systems that can be counted on to define $HOME. I recommend that you test for the existence of this variable in your environment and use an alternative if it does not exist. Of course I'd expect Python to have some system-agnostic means of accessing the home directory path, kind of like the path.join method.

-Andy
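
Python's standard library does provide platform-independent ways to find the home directory: os.path.expanduser('~') and pathlib.Path.home(). A minimal sketch of the "test for HOME and fall back" approach suggested above could look like this. The function name home_dir and the env parameter are made up for illustration and do not appear in the repository.

```python
import os

def home_dir(env=os.environ):
    """Return the user's home directory without assuming HOME is set.

    Hypothetical helper; `env` is a parameter only so it can be tested.
    """
    if 'HOME' in env:
        # Linux, macOS and WSL normally define HOME
        return env['HOME']
    if 'HOMEDRIVE' in env and 'HOMEPATH' in env:
        # Native Windows usually defines these two instead, e.g. 'C:' + '\Users\name'
        return env['HOMEDRIVE'] + env['HOMEPATH']
    # Otherwise let Python work it out for the current platform
    return os.path.expanduser('~')

print(home_dir())
```

In most cases calling os.path.expanduser('~') directly is enough, since it already handles the per-platform lookups.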



AgilentGCMS commented 2 weeks ago

@vwgeiser Can you paste the actual python traceback that is shown when an error is triggered?

vwgeiser commented 2 weeks ago

@AgilentGCMS

I apologize, I should have included this earlier. The traceback leads to the first line after the imports in var4d_demo.ipynb.

The Linux test still doesn't find the input folder, so the RuntimeError is thrown. (I am running it as a jupytext-converted .py file; the .py vs. .ipynb difference does not seem to be what causes the issue.)

var4d = Var4D_Components('only_noaa_observatories', verbose=True, store_intermediate=True)

(schuh) vwgeiser@MACHINENAME:/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d$ python var4d_demo.py
Traceback (most recent call last):
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_demo.py", line 35, in <module>
    var4d = Var4D_Components('only_noaa_observatories', verbose=True, store_intermediate=True) # change verbose to False to see fancy progress bars and suppress prints
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 457, in __init__
    super(Var4D_Components, self).__init__()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 47, in __init__
    super(RunSpecs, self).__init__()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/convert_jacobian.py", line 34, in __init__
    raise RuntimeError('No suitable folder found with input data')
RuntimeError: No suitable folder found with input data
(schuh) vwgeiser@MACHINENAME:/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d$

Running the notebook in Windows then produces the KeyError associated with os.environ['HOME']:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 1
----> 1 var4d = Var4D_Components('only_noaa_observatories', verbose=True, store_intermediate=True) # change verbose to False to see fancy progress bars and suppress prints
      2 flux_corr_structure = {'temp_corr': 2.0} # 2-month temporal correlation, no horizontal correlation
      3 obs_assim_dict = {'sites': ['mlo', 'spo', 'brw', 'smo']} # just the four observatories

File c:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\var4d\var4d_components.py:457, in Var4D_Components.__init__(self, project, *args, **kwargs)
    456 def __init__(self, project, *args, **kwargs):
--> 457     super(Var4D_Components, self).__init__()
    458     self.project = project
    459     self.verbose = kwargs['verbose'] if 'verbose' in kwargs else True

File c:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\var4d\var4d_components.py:47, in RunSpecs.__init__(self, *args, **kwargs)
     46 def __init__(self, *args, **kwargs):
---> 47     super(RunSpecs, self).__init__()
     48     self.start_date = datetime(2014,9,1)
     49     self.end_date = datetime(2016,9,1) # exclusive

File c:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\var4d\convert_jacobian.py:13, in Paths.__init__(self)
     10 super(Paths, self).__init__()
     11 # define the order in which paths will be searched for input data
     12 input_folder_order = [
---> 13     os.path.join(os.environ['HOME'], 'shared/ssim-ghg-data/inversion_examples'), # on GHG Center's JupyterHub
     14     os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/input'), # Sourish's laptop
     15     os.path.join("mnt", "c", "users", "vwgei", "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "input"),
     16     # os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "input")
     17     ]
     18 output_folder_order = [ # for each input folder above, define an appropriate output folder
     19     os.path.join(os.environ['HOME'], 'inversion_output'), # GHG Center's JupyterHub
     20     os.path.join(os.environ['HOME'], 'Code/Teaching/DA Summer School/ssim-ghg-data/2024/output'), # Sourish's laptop
     21     os.path.join("mnt", "c", "users", "vwgei", "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "output"),
     22     # os.path.join(os.environ['HOMEDRIVE'], os.environ['HOMEPATH'], "Documents", "CSU", "SCHUH", "ssim-ghg-2024", "ssim-ghg-data", "2024", "output")
     23     ]
     24 font_order = ['Inconsolata', 'Calibri']

File <frozen os>:716, in __getitem__(self, key)

KeyError: 'HOME'

AgilentGCMS commented 2 weeks ago

@vwgeiser For the linux test, try /mnt as the first path component instead of mnt
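
To illustrate why the leading slash matters: os.path.join only produces an absolute path if the first component is itself absolute. A small sketch, using posixpath (the module behind os.path on Linux and WSL) so that it behaves the same on any platform:

```python
import posixpath  # this is what os.path refers to on Linux/WSL

relative = posixpath.join("mnt", "c", "users")
absolute = posixpath.join("/mnt", "c", "users")

print(relative, posixpath.isabs(relative))  # mnt/c/users False
print(absolute, posixpath.isabs(absolute))  # /mnt/c/users True
```

The relative form is resolved against the current working directory, so from the var4d folder it would point to var4d/mnt/c/..., which does not exist.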

AgilentGCMS commented 2 weeks ago

Although the missing HOME can be worked around on Windows with os.path.expanduser('~'), which resolves to the user's home directory on both Windows and Linux, I'm realizing that adding a line for each user's environment is probably not the best programming practice. Instead, I've created a new file called settings.ini.tmpl. Copy that over to settings.ini and supply your paths. Only the template needs to be version-controlled, not settings.ini itself.

@aschuh @andyjacobson Can this structure of the ini file be read and parsed in R?

vwgeiser commented 2 weeks ago

@AgilentGCMS This worked! Thanks for the work to get this issue resolved, although make sure to specify that convert_jacobian.py is looking for a file named site_settings.ini and not just settings.ini.

I did, however, run into another error because I don't have a "fluxes" directory in my ssim-ghg-data/2024/input directory. Do I need to download this data from somewhere or run another script before running var4d_demo? Or should it already be there, in which case I need to re-download the data tarball you sent earlier?

(schuh) vwgeiser@VGROGZEPHG14:/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d$ python var4d_demo.py
Converting CT2022 to state vector:   0%|                                                         | 0/24 [00:00<?, ?it/s]
  Created true obs in  0.01s
Traceback (most recent call last):
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_demo.py", line 36, in <module>
    var4d.var4d_setup(obs_to_assim=obs_assim_dict, corr_structure=flux_corr_structure, **prior_flux_unc_dict)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 661, in var4d_setup
    self.setup_obs(true_flux='CT2022', obs_to_assim=obs_to_assim)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 579, in setup_obs
    state_vec = self.flux_cons.construct_state_vector_from_ct2022(smush_regions=False)
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 229, in construct_state_vector_from_ct2022
    flux_nee = self.read_ct2022_flux(year, month, 'bio_flux_opt')
  File "/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/var4d/var4d_components.py", line 129, in read_ct2022_flux
    with Dataset(file_name, 'r') as fid:
         ~~~~~~~^^^^^^^^^^^^^^^^
  File "src/netCDF4/_netCDF4.pyx", line 2521, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2158, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/ssim-ghg-data/2024/input/fluxes/CT2022/CT2022.flux1x1.201409.nc'

AgilentGCMS commented 1 week ago

@vwgeiser Correct, that was a typo. The settings file is indeed called site_settings.ini.

Also, I have no idea why but the tarball I made is missing several crucial directories. I've made a new tarball, please re-download.

vwgeiser commented 1 week ago

@AgilentGCMS With the new tarball everything on the Linux end looks great! I got it up and running on my system and was able to get through the entire var4d_demo.ipynb in Jupyter notebook. I ran into another file structure error when running the same code (with an updated site_settings.ini) on my Windows machine. Would you like me to keep working through the notebook and reporting errors on my Windows machine, or will most users be on Linux/macOS?

This is to say that I did run into one error (most likely on my end) in that my favorite font Garamond was not supported... (haha!)

AgilentGCMS commented 1 week ago

Once we release this code, I don't think we can assume that most users will be *nix. So we should strive to make the code as cross-platform as possible. Having said that, if @vwgeiser your only error was that Garamond was not found, then I'm afraid that you'll need to do some platform-specific debugging. The issue may be that matplotlib doesn't know where to find Garamond. You can get the names of all font families you can use with matplotlib as follows.

from matplotlib import font_manager
set([f.name for f in font_manager.fontManager.ttflist])

However, I'm not entirely sure if ttflist also includes OTF or Type1 fonts you may have installed.

vwgeiser commented 1 week ago

@AgilentGCMS Thanks for the update, the font issue was certainly on my end. I now have the full working notebook in both windows and linux. In case it comes up again this is what my site settings looks like for both linux and windows:

site_settings.ini Windows

[paths]
# paths can have spaces, no quoting necessary
# input folder is where the folders 'jacobians', 'obs' and 'transcom' are
# input folder = /mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/ssim-ghg-data
input folder = C:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\ssim-ghg-data
# output folder is where you want output to be stored
# output folder = /mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/ssim-ghg-data
output folder = C:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\ssim-ghg-data

[plotting]
# this is very platform-dependent, uncomment and supply a font name (not path) to use custom font for all figure text
figure font = Garamond

site_settings.ini Linux

[paths]
# paths can have spaces, no quoting necessary
# input folder is where the folders 'jacobians', 'obs' and 'transcom' are
input folder = /mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/ssim-ghg-data
# input folder = C:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\ssim-ghg-data
# output folder is where you want output to be stored
output folder = /mnt/c/users/vwgei/documents/csu/schuh/ssim-ghg-2024/ssim-ghg-data
# output folder = C:\Users\vwgei\Documents\CSU\SCHUH\ssim-ghg-2024\ssim-ghg-data

[plotting]
# this is very platform-dependent, uncomment and supply a font name (not path) to use custom font for all figure text
figure font = Garamond
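
For anyone curious how a file in this format can be read from Python, here is a minimal sketch using the standard-library configparser. How convert_jacobian.py actually parses site_settings.ini is not shown in this thread, so the function name read_site_settings and the sample paths are assumptions for illustration only.

```python
import configparser

# Sample text in the same layout as site_settings.ini (paths are made up)
SAMPLE = """\
[paths]
# paths can have spaces, no quoting necessary
input folder = /data/ssim ghg/input
output folder = C:\\Users\\someone\\output

[plotting]
figure font = Garamond
"""

def read_site_settings(text):
    """Parse settings text and return the values of interest as a dict."""
    # interpolation=None keeps any '%' in a path from being treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        'input': parser.get('paths', 'input folder'),
        'output': parser.get('paths', 'output folder'),
        # the font line may be commented out, so fall back to None
        'font': parser.get('plotting', 'figure font', fallback=None),
    }

settings = read_site_settings(SAMPLE)
print(settings)
```

Option names containing spaces and Windows paths with backslashes are both read as-is, which matches the "no quoting necessary" comment in the template.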