CIGLR-ai-lab / GreatLakes-TempSensors

Collaborative repository for optimizing the placement of temperature sensors in the Great Lakes using the DeepSensor machine learning framework. Aiming to enhance the quantitative understanding of surface temperature variability for better environmental monitoring and decision-making.
MIT License

Bug Report: `get_era5_reanalysis_data` function hanging indefinitely #29

DaniJonesOcean closed this 1 month ago

DaniJonesOcean commented 2 months ago

Description: The get_era5_reanalysis_data function in the DeepSensor package hangs indefinitely at the "Downloading ERA5 data from Google Cloud Storage..." step. This issue prevents further progress in sensor placement experiments for our GreatLakes-TempSensors project.

Steps to Reproduce:

  1. Call the get_era5_reanalysis_data function with relevant parameters. For example:

    from deepsensor.data.sources import get_era5_reanalysis_data
    
    data = get_era5_reanalysis_data(
       var_IDs=['temperature', 'wind_speed'],  # Example variable IDs
       extent='north_america',
       date_range=('2000-01-01', '2005-12-31'),
       freq='D',
       num_processes=1,  # Default setting
       verbose=False,  # Default setting
       cache=False,  # Default setting
       cache_dir='.datacache'  # Default setting
    )

  2. Observe that the code hangs indefinitely and the progress bar remains stuck at 0%.
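
To confirm the hang without blocking a notebook session forever, one option is to run the call in a background thread with a deadline. The following is a minimal sketch using only the standard library. `call_with_timeout` and `slow_stand_in` are hypothetical helper names, not part of DeepSensor, and the `deepsensor.data.sources` import path in the trailing comment is an assumption about the installed package layout.

```python
import queue
import threading
import time


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread and wait up to timeout_s seconds.

    Returns (finished, result). finished is False when func has not returned
    in time. The worker thread is not killed; because it is a daemon thread,
    it will not keep the interpreter alive on exit.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put(("ok", func(*args, **kwargs)))
        except Exception as exc:  # forward errors to the caller
            results.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = results.get(timeout=timeout_s)
    except queue.Empty:
        return False, None
    if status == "error":
        raise value
    return True, value


def slow_stand_in(seconds):
    """Hypothetical stand-in for a download call that takes too long."""
    time.sleep(seconds)
    return "done"


# Assumed usage with the real function (not executed here):
# from deepsensor.data.sources import get_era5_reanalysis_data
# finished, data = call_with_timeout(
#     get_era5_reanalysis_data, 600,
#     var_IDs=['temperature', 'wind_speed'], extent='north_america',
#     date_range=('2000-01-01', '2005-12-31'), freq='D',
# )
# if not finished:
#     print("Still running after 10 minutes; treating as hung.")
```

A thread is used rather than a separate process so that the result does not need to be pickled; the trade-off is that a truly hung call keeps running in the background until the session is restarted.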

Suggested Fixes: DeepSensor developer Tom Andersson suggested the following:

  1. Update DeepSensor Package: Ensure you are using the latest stable version of the DeepSensor package (I think we are):

    pip install --upgrade deepsensor

    Verify that the installed version is v0.3.7 or later.

  2. Set num_processes Argument: Try setting the num_processes argument to 1 to avoid using multiprocessing:

    data = get_era5_reanalysis_data(
       var_IDs=['temperature', 'wind_speed'],
       extent='north_america',
       date_range=('2000-01-01', '2005-12-31'),
       freq='D',
       num_processes=1,  # Single process, no multiprocessing
       verbose=True,  # Enable verbose output to check process details
       cache=False,
       cache_dir='.datacache'
    )

  3. Verbose Mode: Enable verbose mode to check the number of processes being used:

    data = get_era5_reanalysis_data(
       var_IDs=['temperature', 'wind_speed'],
       extent='north_america',
       date_range=('2000-01-01', '2005-12-31'),
       freq='D',
       num_processes=1,
       verbose=True,  # Enable verbose output
       cache=True,  # Enable cache to speed up future runs
       cache_dir='.datacache'
    )

  4. Download Smaller Data Chunks: Attempt to download a smaller amount of ERA5 data instead of the default 20 years:

    data = get_era5_reanalysis_data(
       var_IDs=['temperature', 'wind_speed'],
       extent='north_america',
       date_range=('2001-01-01', '2002-12-31'),  # Smaller date range
       freq='D',
       num_processes=1,
       verbose=True,
       cache=False,
       cache_dir='.datacache'
    )

  5. Google Colab Authentication: If running in Google Colab, ensure proper authentication:

    from google.colab import auth
    auth.authenticate_user()
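
The "smaller data chunks" suggestion can be taken one step further by requesting one calendar year at a time, which also shows which request (if any) gets through. A minimal sketch, assuming the `date_range` end date is inclusive; `split_by_year` is a hypothetical helper, not part of DeepSensor, and nothing in this thread confirms that smaller requests avoid the hang.

```python
from datetime import date


def split_by_year(start, end):
    """Split an inclusive ISO date range into one (start, end) string pair per calendar year."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    if first > last:
        raise ValueError("start must not be after end")
    chunks = []
    for year in range(first.year, last.year + 1):
        # Clip each calendar year to the requested range.
        chunk_start = max(first, date(year, 1, 1))
        chunk_end = min(last, date(year, 12, 31))
        chunks.append((chunk_start.isoformat(), chunk_end.isoformat()))
    return chunks


# Assumed usage (not executed here; import path is an assumption):
# from deepsensor.data.sources import get_era5_reanalysis_data
# for chunk in split_by_year('2000-01-01', '2005-12-31'):
#     print("Requesting", chunk)
#     data = get_era5_reanalysis_data(
#         var_IDs=['temperature', 'wind_speed'], extent='north_america',
#         date_range=chunk, freq='D', num_processes=1,
#         verbose=True, cache=True, cache_dir='.datacache',
#     )
```

With `cache=True`, any year that does complete should not need to be downloaded again on a later run.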

Additional Notes:

Slack Discussion: Tom and Dani discussed this issue in the Slack thread below, which has more context and additional troubleshooting steps.

https://establishingt-utt8550.slack.com/archives/C05NQ76L87R/p1720204678896329


Expected Behavior: The get_era5_reanalysis_data function should download the ERA5 data without hanging, and the progress bar should update appropriately.

Current Behavior: The function hangs indefinitely with no progress in downloading data.

DaniJonesOcean commented 1 month ago

Update: tried all of the above, no luck. It still hangs, and no files are downloaded or saved.

DaniJonesOcean commented 1 month ago

One option would be to download ERA5 ourselves.
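
One way to do this is through the Copernicus Climate Data Store with the third-party `cdsapi` client. The sketch below is not what was actually run for this issue. It assumes a configured CDS account, the `reanalysis-era5-single-levels` dataset, and request key names as used by the CDS web form, which may differ between CDS versions. `build_era5_request`, `download_era5`, and the `GREAT_LAKES_AREA` bounding box are hypothetical and only approximate.

```python
# Approximate Great Lakes bounding box as [north, west, south, east] in degrees.
# These values are an assumption for illustration, not taken from the project.
GREAT_LAKES_AREA = [49.5, -93.0, 41.0, -75.5]


def build_era5_request(variables, year, area):
    """Build a request dict for one year of once-daily ERA5 single-level data."""
    north, west, south, east = area
    if north <= south:
        raise ValueError("area must be [north, west, south, east] with north > south")
    return {
        "product_type": ["reanalysis"],
        "variable": list(variables),
        "year": [str(year)],
        "month": [f"{m:02d}" for m in range(1, 13)],
        # Days 01-31 are listed for every month; this assumes the service
        # skips dates that do not exist.
        "day": [f"{d:02d}" for d in range(1, 32)],
        "time": ["12:00"],
        "area": [north, west, south, east],
        "data_format": "netcdf",
    }


def download_era5(request, target):
    """Submit the request; needs `pip install cdsapi` and CDS credentials."""
    import cdsapi  # imported here so the request builder works without it

    client = cdsapi.Client()
    client.retrieve("reanalysis-era5-single-levels", request, target)


# Assumed usage (not executed here):
# request = build_era5_request(["2m_temperature"], 2001, GREAT_LAKES_AREA)
# download_era5(request, "era5_2001_great_lakes.nc")
```

Restricting the request to a regional box and one time step per day keeps the file small enough to share as sample data.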

For ConvNP example:

For Active Learning example:

DaniJonesOcean commented 1 month ago

@eredding02 Good news! I downloaded a sample of ERA5 and was able to get the active learning notebook to run:

https://github.com/CIGLR-ai-lab/GreatLakes-TempSensors/blob/main/notebooks/05_dcj_active_learning_with_ERA5_sample.ipynb

I've given you access to the Google Drive folder with the sample data and the above Google Colab notebook - let me know if you run into any access issues.

Closing because we've resolved this by downloading some sample data ourselves.