pangeo-data / scikit-downscale

Statistical climate downscaling in Python
https://scikit-downscale.readthedocs.io/en/latest/
Apache License 2.0

Update run_bcsd.py #140

Open richardcmckinney opened 7 months ago

richardcmckinney commented 7 months ago

• Code Consolidation: Identified that the run_ak and run_hi functions share significant similarities in structure and functionality. To streamline our code, these functions will be merged into a single run_processing function. This new function will accept parameters to accommodate the variations previously handled by the separate functions, thus eliminating redundancy.

• Optimized Data Handling: Calling xr.open_mfdataset without specifying chunk sizes (chunks=None) has been flagged as inefficient for large datasets. To address this, the plan is to define appropriate chunk sizes, aiming to boost the performance and efficiency of data processing.

• Code Clarity Enhancements (Variable Naming and Documentation): Variable names will be improved and the code supplemented with detailed comments. This is intended to enhance both clarity and comprehension, making the code more accessible to current and future developers.

• Error Management: Basic error handling mechanisms have been introduced. These enhancements include validations for file operations and command-line arguments, aiming to increase the robustness of our code.

• Reusability and Maintenance: Consolidating the run_ak and run_hi functions into a unified run_processing function will not only eliminate duplicate code but also lay the groundwork for easier maintenance and future enhancements.

• Performance Improvements: By implementing chunking in xr.open_mfdataset, we anticipate significant improvements in handling and processing speeds for large datasets.

• Readability and Maintenance: Through better variable naming and comprehensive commenting, we strive to elucidate the purpose and functionality of each segment of our code, thereby facilitating easier maintenance and updates.

• Adaptability: The refactored code is designed with flexibility in mind, making it more capable of accommodating changes in dataset attributes or processing demands.
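The consolidation described in the bullets above could look roughly like the sketch below. It is only an illustration: the RegionConfig fields, the placeholder file patterns, and the describe stand-in are all hypothetical, since the actual bodies of run_ak and run_hi are not shown in this thread. It also shows one way an unknown region argument might be rejected early.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class RegionConfig:
    """Settings that would differ between the old per-region functions (hypothetical fields)."""

    name: str
    obs_pattern: str  # glob for observation files (placeholder value)
    gcm_pattern: str  # glob for model files (placeholder value)


# Placeholder entries keyed by the suffixes of run_ak / run_hi; the real paths are assumptions.
REGIONS: Dict[str, RegionConfig] = {
    "ak": RegionConfig("ak", "obs/ak/*.nc", "gcm/ak/*.nc"),
    "hi": RegionConfig("hi", "obs/hi/*.nc", "gcm/hi/*.nc"),
}


def run_processing(region: str, process: Callable[[RegionConfig], str]) -> str:
    """Run one shared pipeline for the given region instead of a per-region function."""
    try:
        config = REGIONS[region]
    except KeyError:
        # Basic argument validation: fail with a clear message for unknown regions.
        raise ValueError(
            f"unknown region {region!r}; expected one of {sorted(REGIONS)}"
        ) from None
    return process(config)


def describe(config: RegionConfig) -> str:
    """Stand-in for the real processing steps; only reports what would be read."""
    return f"{config.name}: obs={config.obs_pattern} gcm={config.gcm_pattern}"
```

Calling run_processing("ak", describe) or run_processing("hi", describe) then exercises the same code path, with the per-region differences confined to the REGIONS table.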

This refactoring initiative is driven by our commitment to efficiency, maintainability, and clarity, reflecting our dedication to innovation and high standards in coding practices.
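For the chunking change, a minimal sketch might pass an explicit chunks mapping to xr.open_mfdataset. The dimension name "time", the default chunk size of 365, and the helper names build_open_kwargs and open_forcing are assumptions made for illustration; suitable values depend on the dataset's dimensions and the memory available.

```python
from typing import Any, Dict, Mapping, Optional, Sequence


def build_open_kwargs(chunks: Optional[Mapping[str, int]] = None) -> Dict[str, Any]:
    """Collect keyword arguments for xr.open_mfdataset with explicit chunk sizes."""
    if chunks is None:
        # Assumed starting point only; tune to the data and available memory.
        chunks = {"time": 365}
    for dim, size in chunks.items():
        # -1 asks for a single chunk spanning the whole dimension.
        if size != -1 and size <= 0:
            raise ValueError(f"chunk size for {dim!r} must be positive or -1, got {size}")
    return {"chunks": dict(chunks), "combine": "by_coords"}


def open_forcing(paths: Sequence[str], chunks: Optional[Mapping[str, int]] = None):
    """Open several NetCDF files lazily as one dataset with explicit dask chunks."""
    import xarray as xr  # imported here so build_open_kwargs has no hard dependency

    return xr.open_mfdataset(list(paths), **build_open_kwargs(chunks))
```

Keeping the keyword arguments in a separate helper makes the chosen chunk sizes easy to inspect and test without touching any files.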

richardcmckinney commented 7 months ago

The build error appears to occur during the preparation of package metadata from a pyproject.toml file, specifically while attempting to install a Python package that uses Poetry as its build system.

The error message suggests that a value within the pipfile_deprecated_finder extras does not match the required pattern, which causes metadata generation to fail and prevents the installation from proceeding.

Remediation will likely entail:

  1. Review the pyproject.toml file, particularly the extras section that is mentioned in the error message.

  2. Ensure that all entries under pipfile_deprecated_finder (or any other relevant section pointed out by the error) conform to the specified pattern ^[a-zA-Z-_.0-9]+$. This means removing or correcting any values that contain characters outside of this allowed set.
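The two steps above can be sketched as a small check that flags non-conforming entries. This is an illustration under assumptions: the extras are passed in as an already-parsed mapping, the sample entries are made up, and the pattern value should be copied verbatim from the actual error output rather than taken from this snippet.

```python
import re
from typing import Dict, List, Mapping, Sequence

# Assumed value; copy the exact pattern from the build error message.
NAME_PATTERN = r"^[a-zA-Z-_.0-9]+$"


def find_invalid_extras(
    extras: Mapping[str, Sequence[str]], pattern: str = NAME_PATTERN
) -> Dict[str, List[str]]:
    """Return, per extras group, the entries that do not match the pattern."""
    compiled = re.compile(pattern)
    invalid: Dict[str, List[str]] = {}
    for group, entries in extras.items():
        bad = [entry for entry in entries if not compiled.fullmatch(entry)]
        if bad:
            invalid[group] = bad
    return invalid


# Hypothetical extras table, as it might look after parsing pyproject.toml.
example_extras = {"demo_extra": ["requests", "my_pkg.extra-1", "some-pkg<=1.0"]}
problems = find_invalid_extras(example_extras)
```

Entries carrying a version specifier such as "<=1.0" contain characters outside the allowed set, so they are the kind of value this check would report.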