NASA-IMPACT / veda-pforge-job-runner

Apache Beam + EMR Serverless Job Runner for Pangeo Forge Recipes

Failing: TRMM #16

Closed ranchodeluxe closed 9 months ago

ranchodeluxe commented 9 months ago

https://github.com/pangeo-forge/staged-recipes/pull/261/files

pangeo-forge-runner bake \
    --repo=https://github.com/ranchodeluxe/staged-recipes \
    --ref="trmm-3b42-daily-gcorradini" \
    --Bake.feedstock_subdir="recipes/trmm" \
    -f config.py 
curl -X POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-H "Authorization: token blablah" \
https://api.github.com/repos/NASA-IMPACT/veda-pforge-job-runner/actions/workflows/job-runner.yaml/dispatches \
-d '{"ref":"cmr-input", "inputs":{"repo":"https://github.com/ranchodeluxe/staged-recipes","ref":"trmm-3b42-daily-gcorradini","prune":"1","feedstock_subdir": "recipes/trmm"}}'
curl -X POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-H "Authorization: token blablah" \
https://api.github.com/repos/NASA-IMPACT/veda-pforge-job-runner/actions/workflows/job-runner.yaml/dispatches \
-d '{"ref":"cmr-input", "inputs":{"repo":"https://github.com/ranchodeluxe/staged-recipes","ref":"trmm-3b42-daily-gcorradini","prune":"0","feedstock_subdir": "recipes/trmm"}}'
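The two dispatch calls above differ only in the `prune` input. As a rough sketch (not part of the job runner itself), the same request could be built in Python with the standard library. The function name `build_dispatch_request` is hypothetical, and the token is a placeholder exactly as in the curl commands.

```python
import json
import urllib.request

# Same workflow dispatch endpoint as in the curl commands above.
DISPATCH_URL = (
    "https://api.github.com/repos/NASA-IMPACT/veda-pforge-job-runner"
    "/actions/workflows/job-runner.yaml/dispatches"
)


def build_dispatch_request(token: str, prune: bool) -> urllib.request.Request:
    """Build (but do not send) the workflow dispatch request.

    Hypothetical helper: only the "prune" input varies between the two calls.
    """
    payload = {
        "ref": "cmr-input",
        "inputs": {
            "repo": "https://github.com/ranchodeluxe/staged-recipes",
            "ref": "trmm-3b42-daily-gcorradini",
            "prune": "1" if prune else "0",
            "feedstock_subdir": "recipes/trmm",
        },
    }
    return urllib.request.Request(
        DISPATCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
            "Authorization": f"token {token}",
        },
    )


# Placeholder token, as in the curl commands; nothing is sent here.
pruned = build_dispatch_request("blablah", prune=True)
full = build_dispatch_request("blablah", prune=False)
```

Passing either request to `urllib.request.urlopen` would send it; that step is left out so the sketch has no side effects.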
ranchodeluxe commented 9 months ago

This whole token flow with Earthdata is broken and needs to be revisited: {'statusCode': 500, 'error': 'Internal Server Error', 'message': 'An internal server error occurred'}

ranchodeluxe commented 9 months ago

It seems I could not use my EARTHDATA user and tokens. I had to make quite a few adjustments to the recipe based on this newly merged PR and some other bad type-checking feedback, but local runs still produce this:

Traceback (most recent call last):
  File "/home/ubuntu/venv_310/bin/pangeo-forge-runner", line 8, in <module>                                                                                                                               
    sys.exit(main())                                                                                                                                                                                  
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_runner/cli.py", line 28, in main                                                                                                  
    app.start()                                                                                                                                                                                       
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_runner/cli.py", line 23, in start                                                                                                 
    super().start()                                                                                                                      
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/traitlets/config/application.py", line 474, in start                                                                                                                                
    return self.subapp.start()                                                                                                                                 
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_runner/commands/bake.py", line 328, in start                                                                                                                           
    pipeline.run()                                                                                                                                             
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/pipeline.py", line 585, in run                                                                                                                                          
    return self.runner.run_pipeline(self, self._options)                                                                                 
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/direct/direct_runner.py", line 128, in run_pipeline                                                                                                             
    return runner.run_pipeline(pipeline, options)                                                                                          
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 202, in run_pipeline              
    self._latest_run_result = self.run_via_runner_api(                                                                                                                                                
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 224, in run_via_runner_api        
    return self.run_stages(stage_context, stages)                                                                                                                                                     
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 455, in run_stages                                                           
    bundle_results = self._execute_bundle(                                                                                                 
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 783, in _execute_bundle                                                                                                           
    self._run_bundle(                                                                                                                    
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1020, in _run_bundle                                                                                                              
    result, splits = bundle_manager.process_bundle(                                                                                      
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1462, in process_bundle           
    for result, split_result in executor.map(execute, zip(part_inputs,  # pylint: disable=bad-option-value                               
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator                                                                         
    yield _result_or_cancel(fs.pop())                                                                                                    
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel                                                                       
    return fut.result(timeout)                                                                                                           
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result                                                            
    return self.__get_result()                                                                                                           
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result                                                                            
    raise self._exception                                                                                                                  
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/utils/thread_pool_executor.py", line 37, in run                   
    self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))                                                                   
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1454, in execute                  
    return bundle_manager.process_bundle(                                                                                                                                                             
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1395, in process_bundle                                                                                                           
    raise RuntimeError(result.error)                                                                                                                                                                      
RuntimeError: Traceback (most recent call last):                                                                                                               
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 297, in _execute                                                                               
    response = task()                                                                                                                                          
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 372, in <lambda>                                                                               
    lambda: self.create_worker().do_instruction(request), request)                                                                                                                                    
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 625, in do_instruction                                                                         
    return getattr(self, request_type)(                                                                                                                                                               
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/sdk_worker.py", line 663, in process_bundle                                                                         
    bundle_processor.process_bundle(instruction_id))                                                                                                                                                  
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1056, in process_bundle                                                                  
    input_op_by_transform_id[element.transform_id].process_encoded(                                                                                                                                   
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/runners/worker/bundle_processor.py", line 237, in process_encoded                                                                  
    self.output(decoded_value)                                                                                         
  File "apache_beam/runners/worker/operations.py", line 570, in apache_beam.runners.worker.operations.Operation.output                                                                                                                         
  File "apache_beam/runners/worker/operations.py", line 572, in apache_beam.runners.worker.operations.Operation.output                                                                                                                                         
  File "apache_beam/runners/worker/operations.py", line 263, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive                                                                                                      
  File "apache_beam/runners/worker/operations.py", line 266, in apache_beam.runners.worker.operations.SingletonElementConsumerSet.receive                                                                                                                      
  File "apache_beam/runners/worker/operations.py", line 1162, in apache_beam.runners.worker.operations.CombineOperation.process                                                                                                                
  File "apache_beam/runners/worker/operations.py", line 1166, in apache_beam.runners.worker.operations.CombineOperation.process                                                                                                                                
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/transforms/combiners.py", line 901, in merge_only                                                                                                                       
    return self.combine_fn.merge_accumulators(accumulators)                                                            
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_recipes/combiners.py", line 91, in merge_accumulators                                                                                                                  
    references = [a.translate() for a in accumulators]                                                                         
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_recipes/combiners.py", line 91, in <listcomp>                                                                                                                                          
    references = [a.translate() for a in accumulators]                                                                         
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 496, in translate                                                                                                                                                        
    self.first_pass()                                          
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 258, in first_pass                                                                                                                                                       
    value = self._get_value(i, z, var, fn=self._paths[i])                                                                      
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 226, in _get_value                                                                                                                                                       
    o = z[selector.split(":", 1)[1]][...]                                                                                      
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/zarr/hierarchy.py", line 500, in __getitem__                                                                                                                                                        
    raise KeyError(item)                                       
KeyError: 'time' 
ranchodeluxe commented 9 months ago

I think some of these files are missing a time dimension?

  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/apache_beam/transforms/combiners.py", line 901, in merge_only
    return self.combine_fn.merge_accumulators(accumulators)
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_recipes/combiners.py", line 91, in merge_accumulators
    references = [a.translate() for a in accumulators]
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/pangeo_forge_recipes/combiners.py", line 91, in <listcomp>
    references = [a.translate() for a in accumulators]
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 496, in translate
    self.first_pass()
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 258, in first_pass
    value = self._get_value(i, z, var, fn=self._paths[i])
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/kerchunk/combine.py", line 226, in _get_value
    o = z[selector.split(":", 1)[1]][...]
  File "/home/ubuntu/venv_310/lib/python3.10/site-packages/zarr/hierarchy.py", line 500, in __getitem__
    raise KeyError(item)
KeyError: 'time'
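One way to test the missing-time hypothesis is to inspect the per-file Kerchunk references before they are combined. The sketch below assumes each reference set is a plain dict in the Kerchunk JSON layout, where an array named `time` shows up as a `time/.zarray` key under `refs`. The helper names and the file names in the example are hypothetical.

```python
def has_variable(reference: dict, name: str) -> bool:
    """Return True if the reference set contains a Zarr array called `name`."""
    # Version 1 references nest the keys under "refs"; fall back to the
    # dict itself for the flat (version 0) layout.
    refs = reference.get("refs", reference)
    return f"{name}/.zarray" in refs


def paths_missing_variable(references_by_path: dict, name: str = "time") -> list:
    """List the input paths whose references lack the array `name`."""
    return [
        path
        for path, reference in references_by_path.items()
        if not has_variable(reference, name)
    ]


# Made-up reference sets: the second one has no "time" array.
example = {
    "a.nc4": {"version": 1, "refs": {".zgroup": "{}", "time/.zarray": "{}"}},
    "b.nc4": {"version": 1, "refs": {".zgroup": "{}", "precipitation/.zarray": "{}"}},
}
missing = paths_missing_variable(example)
```

Any path returned here would raise the same `KeyError: 'time'` when the combine step indexes the group by that name.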
ranchodeluxe commented 9 months ago

Given some of the Kerchunk preprocessor issues discussed in this Slack thread, the next step, when I get a chance, is to try:

OpenUrlWithFsSpec | OpenWithXarray | ManuallyAddDataVariableTimeDoFnReturnOpenWithKerchunkInline | <rest of recipes>
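The thread does not say where the manually added time value would come from. Purely as an illustration, and assuming the date is encoded in each file name as an eight-digit `YYYYMMDD` run, a stage like this could derive the timestamp as sketched below. The function name, the pattern, and the example file name are all hypothetical and are not taken from the recipe.

```python
import re
from datetime import datetime

# Assumption: exactly one standalone YYYYMMDD run appears in the file name.
DATE_IN_NAME = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def time_from_filename(name: str) -> datetime:
    """Parse the first standalone eight-digit date found in `name`."""
    match = DATE_IN_NAME.search(name)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d")


# Made-up file name used only to exercise the helper.
stamp = time_from_filename("example.19980101.nc4")
```

The resulting value could then be attached to the opened dataset as its time coordinate before the Kerchunk step, so that every input carries a `time` array.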
abarciauskas-bgse commented 9 months ago

@ranchodeluxe should we close this (at least for now) since, as we discussed this morning, we are not going to target this dataset?