FREVA-CLINT / freva

The Free Evaluation System Framework (FreVa)

modify the `add_output_to_databrowser` method #130

Closed eelucio closed 1 year ago

eelucio commented 1 year ago

Currently it:

  1. gathers all the metadata for each file via drs_config and copies the file to the new folder path
  2. then indexes at the product_dir level instead of at the file level: https://github.com/FREVA-CLINT/freva/blob/fe2df45999c5030bc55d0294186f2515e7a186b7/src/evaluation_system/api/plugin.py#L572

Maybe modify it so that it a) does not index at the product_dir level by default, b) does not clear the folder path each time, or c) does something better?

If I want to, for example, multithread the indexing of a bunch of files with different experiment_names, I can't, because the threads concurrently wipe everything at the level above.
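
To illustrate the conflict described above, here is a toy model only, not freva's actual code: a plain dict stands in for the search index, and the names `index_files`, `run`, and `clear_level` are made up for this sketch. It assumes that indexing first drops existing entries at some directory level and then adds the new files, and compares clearing at the product level with clearing at the experiment level.

```python
# Toy model: a dict stands in for the search index. This is NOT freva's code;
# index_files, run and clear_level are hypothetical names for illustration.
def index_files(index, product_dir, experiment, files, clear_level):
    """Drop existing entries at the chosen level, then add the new files."""
    if clear_level == "product":
        prefix = product_dir + "/"
    else:
        prefix = f"{product_dir}/{experiment}/"
    # Collect matching keys first so the dict is not mutated while iterating.
    for path in [p for p in index if p.startswith(prefix)]:
        del index[path]
    for name in files:
        index[f"{product_dir}/{experiment}/{name}"] = experiment
    return index


def run(clear_level):
    """Index two experiments that share one product directory."""
    index = {}
    index_files(index, "observations/station", "exp_a", ["a.nc"], clear_level)
    index_files(index, "observations/station", "exp_b", ["b.nc"], clear_level)
    return sorted(index)


print(run("product"))     # exp_a's entry was wiped when exp_b was indexed
print(run("experiment"))  # both entries survive
```

With product-level clearing, whichever experiment is indexed last removes the entries of the others, which is why running several of these calls concurrently cannot work; clearing only below the experiment directory keeps the calls independent in this model.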

antarcticrainforest commented 1 year ago

#133 should fix this. My idea was that users can just create the right data structure and then index whenever they think it's a good time, for example by using the UserData class after the new data files have been created:

from freva import UserData
user_data = UserData()
user_data.index()

eelucio commented 1 year ago

Something I realised, which might be unrelated, happens when I multithread the add_output_to_databrowser() method, e.g.:

import os
import traceback
from concurrent.futures import ThreadPoolExecutor

def add_to_databrowser(self, config_dict, station_dir):
    # Register one station directory; the experiment is named after the directory.
    self.add_output_to_databrowser(
        station_dir,
        project="observations",
        product="station",
        model="DWD",
        institute="DWD",
        experiment=os.path.basename(station_dir),
        time_frequency=config_dict["time_frequency"],
    )
...
with ThreadPoolExecutor(max_workers=max_workers) as executor:
    try:
        # station_dirs is a list of directories
        executor.map(lambda x: self.add_to_databrowser(config_dict, x), station_dirs)
    except Exception:
        traceback.print_exc()

https://www.xces.dkrz.de/history/924140/results/

Could not read git version
ERROR:freva:Could not read git version

All the threads show this message, and no folder is even created before the ingestion.
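
One thing that may be worth ruling out here: `Executor.map` only re-raises a worker's exception when its results are iterated, so a `try`/`except` around a bare `executor.map(...)` call whose results are never consumed does not see failures inside the threads. The sketch below shows one way to surface them with `submit` and `as_completed`. The names `run_all` and `fake_ingest` are hypothetical; `fake_ingest` only stands in for the real `add_to_databrowser` call, and nothing here says what actually fails inside freva.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(func, items, max_workers=4):
    """Run func on every item in a thread pool and return (results, errors)."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the item it was created for.
        futures = {executor.submit(func, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                # result() re-raises any exception raised inside the worker.
                results[item] = future.result()
            except Exception as error:
                errors[item] = error
    return results, errors


def fake_ingest(station_dir):
    # Hypothetical stand-in for self.add_to_databrowser(config_dict, station_dir).
    if station_dir.endswith("bad"):
        raise ValueError(f"cannot ingest {station_dir}")
    return station_dir


results, errors = run_all(fake_ingest, ["/tmp/a", "/tmp/bad"])
for item, error in errors.items():
    print(f"{item}: {error!r}")
```

Collecting errors per directory, rather than stopping at the first one, makes it possible to see whether every thread fails the same way or only some of them do.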