Open-EO / openeo-processes-dask

Python implementations of many OpenEO processes, dask-friendly by default.
Apache License 2.0

Python doesn't find the 'core' and 'specs' modules when I try to import them #227

Closed: automataIA closed this issue 8 months ago

automataIA commented 8 months ago

While following the notebook 01_minibackend_demo.ipynb, I tried to import these modules:

  1. core
  2. specs
  3. save_result
  4. load_collection

Python reports that they don't exist, and indeed I can't find them among the names listed by the dir command. I tried the various installation options, uninstalling and reinstalling, and also tested the plain install, but always with the same result:

!pip uninstall -y openeo-processes-dask
!pip install openeo-processes-dask
!pip install 'openeo-processes-dask[implementations,ml,experimental]'

This is the file and directory structure of the openeo-processes-dask library as installed in my venv at ~/.pyenv/versions/3.9.18/envs/backend/lib/python3.9/site-packages/openeo_processes_dask:

.
├── __init__.py
├── __pycache__
│   └── __init__.cpython-39.pyc
├── process_implementations
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-39.pyc
│   │   ├── arrays.cpython-39.pyc
│   │   ├── comparison.cpython-39.pyc
│   │   ├── core.cpython-39.pyc
│   │   ├── data_model.cpython-39.pyc
│   │   ├── exceptions.cpython-39.pyc
│   │   ├── logic.cpython-39.pyc
│   │   ├── math.cpython-39.pyc
│   │   └── utils.cpython-39.pyc
│   ├── arrays.py
│   ├── comparison.py
│   ├── core.py
│   ├── cubes
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-39.pyc
│   │   │   ├── _filter.cpython-39.pyc
│   │   │   ├── _xr_interop.cpython-39.pyc
│   │   │   ├── aggregate.cpython-39.pyc
│   │   │   ├── apply.cpython-39.pyc
│   │   │   ├── experimental.cpython-39.pyc
│   │   │   ├── general.cpython-39.pyc
│   │   │   ├── indices.cpython-39.pyc
│   │   │   ├── load.cpython-39.pyc
│   │   │   ├── mask.cpython-39.pyc
│   │   │   ├── mask_polygon.cpython-39.pyc
│   │   │   ├── merge.cpython-39.pyc
│   │   │   ├── reduce.cpython-39.pyc
│   │   │   ├── resample.cpython-39.pyc
│   │   │   └── utils.cpython-39.pyc
│   │   ├── _filter.py
│   │   ├── _xr_interop.py
│   │   ├── aggregate.py
│   │   ├── apply.py
│   │   ├── experimental.py
│   │   ├── general.py
│   │   ├── indices.py
│   │   ├── load.py
│   │   ├── mask.py
│   │   ├── mask_polygon.py
│   │   ├── merge.py
│   │   ├── reduce.py
│   │   ├── resample.py
│   │   └── utils.py
│   ├── data_model.py
│   ├── exceptions.py
│   ├── experimental
│   │   ├── __init__.py
│   │   └── __pycache__
│   │       └── __init__.cpython-39.pyc
│   ├── logic.py
│   ├── math.py
│   ├── ml
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   │   ├── __init__.cpython-39.pyc
│   │   │   ├── curve_fitting.cpython-39.pyc
│   │   │   └── random_forest.cpython-39.pyc
│   │   ├── curve_fitting.py
│   │   └── random_forest.py
│   └── utils.py
└── specs
    ├── __init__.py
    ├── __pycache__
    │   └── __init__.cpython-39.pyc
    └── openeo-processes
        ├── CHANGELOG.md
        ├── LICENSE
        ├── README.md
        ├── absolute.json
        ├── adaptive_threshold.json
        ├── add.json
        ├── add_dimension.json
        ├── aggregate_spatial.json
        ├── aggregate_temporal.json
        ├── aggregate_temporal_period.json
        ├── all.json
        ├── and.json
        ├── any.json
        ├── apply.json
        ├── apply_dimension.json
        ├── apply_kernel.json
        ├── arccos.json
        ├── arcosh.json
        ├── arcsin.json
        ├── arctan.json
        ├── arctan2.json
        ├── ard_normalized_radar_backscatter.json
        ├── ard_surface_reflectance.json
        ├── array_append.json
        ├── array_concat.json
        ├── array_contains.json
        ├── array_create.json
        ├── array_element.json
        ├── array_filter.json
        ├── array_find.json
        ├── array_labels.json
        ├── array_modify.json
        ├── arsinh.json
        ├── artanh.json
        ├── atmospheric_correction.json
        ├── between.json
        ├── ceil.json
        ├── clip.json
        ├── constant.json
        ├── cos.json
        ├── cosh.json
        ├── count.json
        ├── create_raster_cube.json
        ├── dimension_labels.json
        ├── divide.json
        ├── drop_dimension.json
        ├── e.json
        ├── eq.json
        ├── examples
        │   ├── array_contains_nodata.json
        │   ├── array_find_nodata.json
        │   └── rename-enumerated-labels.json
        ├── exp.json
        ├── extrema.json
        ├── filter_bands.json
        ├── filter_bbox.json
        ├── filter_labels.json
        ├── filter_spatial.json
        ├── filter_temporal.json
        ├── first.json
        ├── fit_curve.json
        ├── fit_regr_random_forest.json
        ├── floor.json
        ├── gt.json
        ├── gte.json
        ├── if.json
        ├── int.json
        ├── is_infinite.json
        ├── is_nan.json
        ├── is_nodata.json
        ├── is_valid.json
        ├── last.json
        ├── linear_scale_range.json
        ├── ln.json
        ├── load_collection.json
        ├── load_ml_model.json
        ├── load_stac.json
        ├── load_vector_cube.json
        ├── log.json
        ├── lt.json
        ├── lte.json
        ├── mask.json
        ├── mask_polygon.json
        ├── max.json
        ├── mean.json
        ├── median.json
        ├── merge_cubes.json
        ├── meta
        │   ├── implementation.md
        │   └── subtype-schemas.json
        ├── min.json
        ├── missing-processes
        │   ├── anomaly.json
        │   ├── apply_neighborhood.json
        │   ├── array_apply.json
        │   ├── climatological_normal.json
        │   ├── filter_bbox.json
        │   ├── rename_dimension.json
        │   ├── rename_labels.json
        │   ├── resample_spatial.json
        │   ├── run_udf.json
        │   ├── text_begins.json
        │   ├── text_contains.json
        │   ├── text_ends.json
        │   ├── text_merge.json
        │   └── trim_cube.json
        ├── mod.json
        ├── multiply.json
        ├── nan.json
        ├── ndvi.json
        ├── neq.json
        ├── normalized_difference.json
        ├── not.json
        ├── or.json
        ├── order.json
        ├── pi.json
        ├── power.json
        ├── predict_curve.json
        ├── predict_random_forest.json
        ├── product.json
        ├── proposals
        │   ├── aggregate_spatial_window.json
        │   ├── ard_normalized_radar_backscatter.json
        │   ├── ard_surface_reflectance.json
        │   ├── array_append.json
        │   ├── array_create_labeled.json
        │   ├── array_find_label.json
        │   ├── array_interpolate_linear.json
        │   ├── atmospheric_correction.json
        │   ├── cloud_detection.json
        │   ├── cummax.json
        │   ├── cummin.json
        │   ├── cumproduct.json
        │   ├── cumsum.json
        │   ├── date_shift.json
        │   ├── filter_labels.json
        │   ├── inspect.json
        │   ├── is_infinite.json
        │   ├── load_result.json
        │   ├── load_uploaded_files.json
        │   ├── resample_cube_temporal.json
        │   ├── run_udf_externally.json
        │   └── sar_backscatter.json
        ├── quantiles.json
        ├── rearrange.json
        ├── reduce_dimension.json
        ├── reduce_spatial.json
        ├── resample_cube_spatial.json
        ├── resample_spatial.json
        ├── round.json
        ├── sar_backscatter.json
        ├── save_ml_model.json
        ├── save_result.json
        ├── save_vector_cube.json
        ├── sd.json
        ├── sen2like.json
        ├── sgn.json
        ├── sin.json
        ├── sinh.json
        ├── sort.json
        ├── sqrt.json
        ├── subtract.json
        ├── sum.json
        ├── tan.json
        ├── tanh.json
        ├── tests
        │   ├── README.md
        │   ├── docs.html
        │   ├── examples.test.js
        │   ├── package.json
        │   ├── processes.test.js
        │   ├── subtypes-file.test.js
        │   ├── subtypes-schemas.test.js
        │   └── testHelpers.js
        ├── variance.json
        ├── vessel_detection.json
        └── xor.json
clausmichele commented 8 months ago

If I remember correctly, the author of that notebook is @ValentinaHutter. Could you please have a look at this?

automataIA commented 8 months ago

I found the solution to part of the problem! The current library lays out its modules differently from what the notebook shows. This is the updated import code:

from openeo_processes_dask.process_implementations import apply, ndvi, multiply, load_stac, core #, save_result
from openeo_processes_dask.process_implementations.core import process

The remaining problem is that it still can't find the save_result process.

ValentinaHutter commented 8 months ago

A former colleague of mine created this example notebook - good to see that you already figured it out!

The save_result process is not available in this repository. This is because the process itself can be backend-specific: it depends on which formats the backend supports, where the results should be stored (file paths, file names), and also what kind of input data the backend provides. (For a specific CRS you might want a specific grid and naming convention.) Therefore, we did not include a general implementation here.

In your notebook, you could implement save_result by using data.to_netcdf(<filename>).
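Valentina's suggestion can be wrapped into a small stand-in process. The sketch below is purely illustrative (the function name `save_result`, the `SUPPORTED_WRITERS` mapping, and the netCDF-only support are assumptions, not part of openeo-processes-dask); it assumes `data` is an xarray object, such as the result of load_stac, so that `.to_netcdf()` is available.

```python
from pathlib import Path

# Default writer assumes `data` is an xarray object (e.g. the result of
# load_stac), so .to_netcdf() is available on it. This mapping is a
# hypothetical extension point for other backend-specific formats.
SUPPORTED_WRITERS = {
    "netcdf": lambda data, filename: data.to_netcdf(filename),
}

def save_result(data, filename, format="netCDF", writers=None):
    """Hypothetical stand-in for the missing save_result process:
    write `data` to `filename` in the requested format."""
    writers = writers if writers is not None else SUPPORTED_WRITERS
    try:
        writer = writers[format.lower()]
    except KeyError:
        raise ValueError(f"Unsupported output format: {format}")
    writer(data, filename)
    return Path(filename)
```

The `writers` parameter is there so a backend can swap in its own format handlers (GeoTIFF, Zarr, ...) without touching the dispatch logic.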

automataIA commented 8 months ago

> A former colleague of mine created this example notebook - good to see that you already figured it out!
>
> The save_result process is not available in this repository. This is because the process itself can be backend specific, depending on which formats the backends support, where the results should be stored (filepaths, filenames) and also what kind of input data the backend provided. (For specific CRS you might want a specific grid and naming convention.) Therefore, we did not include a general implementation here.
>
> In your notebook, you could implement save_result by using data.to_netcdf(<filename>)

Thanks for your answer. I'm actually still trying to understand, in general, what the parts of the backend are (particularly the Python parts) and how they interact with each other. :cry:

ValentinaHutter commented 8 months ago

This repository and openeo-pg-parser-networkx are the two parts that handle incoming process graphs - see https://github.com/Open-EO/openeo-pg-parser-networkx/blob/main/README.md. Neither repository includes any code to load or save backend-specific data. So, for a backend, you would need another repository where you implement your own versions of load_collection and save_result. Since their specifications are available in openeo-processes, you can add them to a process registry. :) You can also read more about process graphs in the API specification: https://api.openeo.org/#section/Processes
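The registry idea described above can be illustrated with a toy example. The plain dict below is a stand-in for the real ProcessRegistry class, and both implementations are hypothetical; the point is just that a parser resolves each process-graph node to a registered implementation by its process id.

```python
# Toy process implementations; save_result here is a hypothetical
# backend-specific function, not one shipped by openeo-processes-dask.
def multiply(x, y):
    return x * y

def save_result(data, filename):
    # A real backend would write `data` to storage here.
    return f"saved {data!r} to {filename}"

# Stand-in for a process registry: process id -> implementation.
process_registry = {
    "multiply": multiply,
    "save_result": save_result,
}

def apply_process(process_id, arguments):
    # Resolve the implementation by id, the way a parser would when
    # walking a process-graph node, then call it with the node's arguments.
    impl = process_registry[process_id]
    return impl(**arguments)
```

Registering your own save_result then amounts to adding one more entry to the registry, while the specs from openeo-processes describe the expected parameters.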

clausmichele commented 8 months ago

For save_result, you could start from my implementation here: https://github.com/SARScripts/openeo_odc_driver/blob/dask_processes/openeo_odc_driver/processing.py

There's also a load_collection implementation based on opendatacube, but if you're using load_stac you don't need it.