simpeg / aurora

software for processing natural source electromagnetic data
MIT License

MTH5 not closing if Aurora crashes #324

Open kujaku11 opened 2 months ago

kujaku11 commented 2 months ago

Somewhere in the Aurora pipeline an MTH5 file is opened, and if Aurora crashes the MTH5 stays open, which is not ideal. It may be line 405 of aurora/pipelines/process_mth5: tfk.initialize_mth5s(). Perhaps a with statement there could close the MTH5 if something crashes. Or a try/except that catches errors: if an error is encountered, call mth5.utils.helpers.close_open_files() and then re-raise the error?

Maybe:

```python
from mth5.helpers import close_open_files

try:
    # Initialize config and mth5s
    tfk = TransferFunctionKernel(dataset=tfk_dataset, config=config)
    tfk.make_processing_summary()
    tfk.show_processing_summary()
    tfk.validate()

    tfk.initialize_mth5s()

    msg = (
        f"Processing config indicates {len(tfk.config.decimations)} "
        f"decimation levels"
    )
    logger.info(msg)
    tf_dict = {}

    for i_dec_level, dec_level_config in enumerate(tfk.valid_decimations()):
        # if not tfk.all_fcs_already_exist():
        tfk.update_dataset_df(i_dec_level)
        tfk.apply_clock_zero(dec_level_config)

        stfts = get_spectrogams(tfk, i_dec_level, units=units)

        local_merged_stft_obj, remote_merged_stft_obj = merge_stfts(stfts, tfk)

        # FC TF Interface here (see Note #3)
        # Could downweight bad FCs here

        ttfz_obj = process_tf_decimation_level(
            tfk.config,
            i_dec_level,
            local_merged_stft_obj,
            remote_merged_stft_obj,
        )
        ttfz_obj.apparent_resistivity(tfk.config.channel_nomenclature, units=units)
        tf_dict[i_dec_level] = ttfz_obj

        if show_plot:
            from aurora.sandbox.plot_helpers import plot_tf_obj

            plot_tf_obj(ttfz_obj, out_filename="")

    tf_collection = TransferFunctionCollection(
        tf_dict=tf_dict, processing_config=tfk.config
    )

    tf_cls = tfk.export_tf_collection(tf_collection)

    if z_file_path:
        tf_cls.write(z_file_path)

    tfk.dataset.close_mth5s()
    if return_collection:
        # this is now really only to be used for debugging and may be deprecated soon
        return tf_collection
    else:
        return tf_cls
except Exception:
    # Close any MTH5 files left open, then re-raise the original error
    close_open_files()
    raise
```
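The with-statement idea mentioned above could also be done with a small context manager that guarantees cleanup on a crash. A minimal sketch, assuming a `cleanup` callable is passed in (in Aurora this would be `mth5.helpers.close_open_files`; the name `closing_mth5s` is hypothetical):

```python
from contextlib import contextmanager


@contextmanager
def closing_mth5s(cleanup):
    """Run the with-block body; on any exception, call cleanup() then re-raise."""
    try:
        yield
    except Exception:
        cleanup()
        raise


# Demonstration with a stand-in cleanup function recording that it was called
closed = []
try:
    with closing_mth5s(lambda: closed.append(True)):
        raise RuntimeError("simulated Aurora crash")
except RuntimeError:
    pass  # the error still propagates; cleanup has already run
```

The processing body would then sit inside `with closing_mth5s(close_open_files):` instead of a try/except, avoiding the duplicated cleanup-and-reraise logic at the call site.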
kkappler commented 2 months ago

Thanks -- This is a good suggestion. I followed it mostly, but with a twist.

I renamed process_mth5 to process_mth5_legacy, and now process_mth5 calls process_mth5_legacy inside a try/except.

This avoids an extra layer of indentation in the core processing method and, more importantly, provides an entry point for other variations on processing from MTH5 in the future, i.e. a place to plug in other codes with other processing configs.
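The wrapper pattern described above can be sketched roughly as follows (a sketch only: the stand-in bodies and the `processing_type` dispatch parameter are assumptions, not Aurora's actual signatures; `close_open_files` stands in for `mth5.helpers.close_open_files`):

```python
closed = []


def close_open_files():
    """Stand-in for mth5.helpers.close_open_files; records that it ran."""
    closed.append(True)


def process_mth5_legacy(config, dataset):
    """Stand-in for the core processing method (the renamed process_mth5).
    Here it simulates a crash partway through the pipeline."""
    raise RuntimeError("simulated crash")


def process_mth5(config, dataset, processing_type="legacy"):
    """New entry point: dispatches to a processing implementation and
    guarantees open MTH5 files are closed if processing raises."""
    if processing_type != "legacy":
        raise NotImplementedError(f"unknown processing_type: {processing_type}")
    try:
        return process_mth5_legacy(config, dataset)
    except Exception:
        close_open_files()
        raise


# Demonstration: the crash propagates, but cleanup has run first
try:
    process_mth5(None, None)
except RuntimeError:
    pass
```

Keeping the try/except in the thin wrapper means the core method stays flat, and new processing variants can be added as additional branches on `processing_type` without touching the cleanup logic.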

The fix is up on patches.

The tests seem to be passing; I will need to get past #325 before I can merge into main.