ladybug-tools / dragonfly-grasshopper

:dragon: :green_book: Dragonfly plugin for Grasshopper.
GNU Affero General Public License v3.0

How Should We Import LANDSAT TIFF Images? #128

Open chriswmackey opened 5 years ago

chriswmackey commented 5 years ago

I know that we eventually want to support importing publicly available thermal satellite images from LANDSAT. They are very useful for understanding the temperature variations across cities and are particularly good for understanding radiant temperature variation (completing the picture with the uwg, which is primarily meant to show air temperature variation).

In the legacy dragonfly, I did a proof of concept with this using DOT NET libraries to get the TIFF pixel values but this method is clearly not ideal if we want to build a cross-platform library that allows us to process these images on any operating system or interface.

From what I can tell, there does not seem to be a simple way of getting TIFF pixel values without using a 3rd party module, even though all that we need is the pixel values and aren't doing any fancy operations on the images like transforms or filters. We don't even need to save back to the image but it seems no one makes an imaging library that is import only.

So it seems like a shame to add a 3rd party module that could inhibit applications in ironPython just to be able to read TIFF image pixel values.

As such, I am thinking of trying to find the parts of the Python Imaging Library (PIL) that are specifically related to grabbing pixel values. I can then try to make those parts cross platform and make them a sub-library of dragonfly.

If anyone has any better suggestions, please let me know.

Another possible idea is to have separate code for importing TIFF values depending on whether you are in ironPython (in which case you use the DOT NET library) or cPython (in which case we might use pip install PIL?).

AntoineDao commented 5 years ago

Ah, the joys of dependency hell! I would personally opt for a separation between IronPython and (real) Python. IronPython implementations will move some of the grunt work (data processing/visualisation) to Grasshopper from the outputs of Dragonfly but will bottleneck these operations. As such, the workflow will be different in scope and scale from what could be achieved using vanilla Python (by this I mean larger-scale parsing/processing than with Grasshopper).

Whichever way this project goes, it's going to have to take on some technical debt, be it by forking and home cooking image processing or by managing differences in Python versions. I suspect that taking on some debt by segregating IronPython will be more beneficial/manageable in the long term if we add wrappers around image processing functions that abstract away the differences in implementation.

There is a risk of diverging capabilities in the future (eg: new file formats that IronPython can't process but Python + PIL can process) but I reckon that's a bridge that can be crossed when we get there. A possible solution will be implementing a dragonfly version of honeybee-worker.

chriswmackey commented 5 years ago

Thanks for your thoughts, @AntoineDao .

If I am understanding your suggestion, we could implement it similarly to how download_file works for the cPython ladybug core: https://github.com/ladybug-tools/ladybug/blob/master/ladybug/futil.py#L171-L243 ...and the ironPython ladybug plugin: https://github.com/ladybug-tools/ladybug-grasshopper/blob/master/ladybug/dotnet.py#L13-L48

This is going to be simpler to set up, though I worry about losing control down the road if we have dependencies outside of the ladybug tools libraries.

For the cPython core, I might still keep a copy of the relevant parts of the PIL library with the dragonfly core. However, I won't worry about trying to get PIL to work with ironPython since we will rely on DOTNET for this part.
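For illustration, here is a minimal sketch of what such a wrapper could look like. Every name in it (`select_backend`, `read_pixel_values`, and the two private readers) is hypothetical and does not exist in dragonfly or ladybug. It also assumes `System.Drawing.Bitmap` on the .NET side and Pillow (imported as `PIL`) on the cPython side, neither of which has been settled in this thread. The backends are imported lazily, so each interpreter only loads the library it can actually use.

```python
import platform


def select_backend(implementation=None):
    """Return 'dotnet' under IronPython and 'pil' under any other interpreter."""
    if implementation is None:
        implementation = platform.python_implementation()
    return 'dotnet' if implementation == 'IronPython' else 'pil'


def _read_with_dotnet(path):
    # Only importable under IronPython. System.Drawing is an assumed choice here.
    import clr
    clr.AddReference('System.Drawing')
    from System.Drawing import Bitmap
    bitmap = Bitmap(path)
    try:
        # Takes the red channel, which equals the gray value for 8-bit grayscale.
        return [[bitmap.GetPixel(x, y).R for x in range(bitmap.Width)]
                for y in range(bitmap.Height)]
    finally:
        bitmap.Dispose()


def _read_with_pil(path):
    # Requires the Pillow package under cPython (assumed, not a current dependency).
    from PIL import Image
    image = Image.open(path)
    pixels = image.load()
    width, height = image.size
    return [[pixels[x, y] for x in range(width)] for y in range(height)]


def read_pixel_values(path):
    """Return pixel values as a list of rows, regardless of the interpreter."""
    if select_backend() == 'dotnet':
        return _read_with_dotnet(path)
    return _read_with_pil(path)
```

Callers would only ever touch `read_pixel_values`, so a backend could later be swapped out without changing the components that use it.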

saeranv commented 5 years ago

Hey guys, I think the wrapper solution for a DOTNET and cPython image processing library is a good one. For the sake of completeness, I've brainstormed 3 ways to achieve a cross-platform solution for image processing.

In order of reducing dependencies:

  1. @AntoineDao's suggestion of using an image-processing wrapper that will then call a DOTNET or cPython library as required. It will look something like the download_file method.

    Pros: Simple solution that is already partially implemented. The fastest solution since you can move your IronPython objects to IronPython methods, and cPython objects to cPython methods. Cons: Need dependencies for both DOTNET and cPython image libraries == dependency hell.

  2. Send Grasshopper data to a cPython interpreter. This would be the honeybee-worker approach (I think). Basically if you are on the DOTNET platform, the component will make a subprocess call to a cPython interpreter, and then run the PIL.

    Pros: You can stick to just using the cPython image libraries, no need for the DOTNET library. Cons: A cPython interpreter would be a dependency. You will also need to use pickle or JSON to pass your image data from an IronPython interpreter to a cPython interpreter. I believe this is how DIVA works: you install a Python interpreter with the components.

  3. Create a REST API for image processing. Convert your image data into some sort of JSON string, send it as an HTTP request to a server that can then use the cPython PIL for image processing, and then send the data back as a JSON string.

    Pros: No dependencies for the user. Cons: You need to run a server. You will need to convert your data to and from JSON. This solution will take the longest time since we're adding calls to a server on top of the image processing.

Moving to a pure cPython solution like 2 or 3 solves all cross-compatibility issues for us in one go. If you go with option 1, you will always be creating dual DOTNET and cPython wrappers and library calls whenever this problem crops up again. That being said, solution 1 is simpler and faster.
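To make option 2 a bit more concrete, here is a rough sketch of the subprocess round trip using JSON over stdin/stdout. The names `run_in_cpython` and `WORKER_SOURCE` are made up for this example, and the worker assumes Pillow is installed in the cPython environment. This is not how honeybee-worker is actually implemented, just one way the hand-off could work.

```python
import json
import subprocess
import sys

# Source of the script that runs inside the cPython interpreter (assumes Pillow).
WORKER_SOURCE = r'''
import json, sys
from PIL import Image
request = json.load(sys.stdin)
image = Image.open(request["path"])
pixels = image.load()
width, height = image.size
rows = [[pixels[x, y] for x in range(width)] for y in range(height)]
json.dump({"width": width, "height": height, "rows": rows}, sys.stdout)
'''


def run_in_cpython(request, interpreter=None, worker_source=WORKER_SOURCE):
    """Send a JSON request to a separate interpreter and return its JSON reply."""
    if interpreter is None:
        # Under IronPython, pass the path to a cPython executable instead.
        interpreter = sys.executable
    process = subprocess.Popen(
        [interpreter, '-c', worker_source],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = process.communicate(json.dumps(request).encode('utf-8'))
    if process.returncode != 0:
        raise RuntimeError(err.decode('utf-8', 'replace'))
    return json.loads(out.decode('utf-8'))
```

A component would call something like `run_in_cpython({'path': tiff_path}, interpreter=path_to_cpython)`, where both variables are placeholders. The serialization cost grows with image size, which is the overhead mentioned in the cons above.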

-S

saeranv commented 5 years ago

Actually, I forgot there's another option already mentioned.

  4. Write your own Python implementation that doesn't have any IronPython-specific or cPython-specific dependencies (e.g. SciPy) so that it can work on all platforms.

    Pros: No dependencies, no infrastructure. Can take advantage of and contribute to libraries like ladybug-geometry, which are built to be cross-platform in this fashion. Cons: Involves a lot of work for the developer in writing and maintaining one's own fork of a library. Potentially slower if it doesn't take advantage of the efficiencies built into cPython libraries (e.g. using NumPy arrays instead of Python lists). Also limits the ability to take advantage of and build upon the broader Python ecosystem.
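To give a sense of the scope of this option, here is a minimal pure-Python sketch that pulls pixel values out of a TIFF using only the standard `struct` module. It is not taken from PIL, and `read_tiff_pixels` is a hypothetical name. It only handles the simplest case: the first image in the file, uncompressed, strip-based, one sample per pixel, 8 or 16 bits. Real LANDSAT files may be tiled or compressed, which would need a good deal more code than this.

```python
import struct

# TIFF field types this sketch understands: BYTE, SHORT, LONG.
_TYPE_FORMATS = {1: 'B', 3: 'H', 4: 'I'}


def _read_ifd(data, endian):
    """Parse the first image file directory into a {tag: values} dict."""
    (ifd_offset,) = struct.unpack_from(endian + 'I', data, 4)
    (n_entries,) = struct.unpack_from(endian + 'H', data, ifd_offset)
    tags = {}
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i
        tag, ftype, count = struct.unpack_from(endian + 'HHI', data, entry)
        fmt = _TYPE_FORMATS.get(ftype)
        if fmt is None:
            continue  # skip ASCII, RATIONAL and other types not needed here
        if struct.calcsize(fmt) * count <= 4:
            pos = entry + 8  # value is stored inline in the entry
        else:
            (pos,) = struct.unpack_from(endian + 'I', data, entry + 8)
        tags[tag] = struct.unpack_from(endian + str(count) + fmt, data, pos)
    return tags


def read_tiff_pixels(data):
    """Return pixel values as a list of rows from the raw bytes of a TIFF."""
    if data[:2] == b'II':
        endian = '<'
    elif data[:2] == b'MM':
        endian = '>'
    else:
        raise ValueError('not a TIFF file')
    (magic,) = struct.unpack_from(endian + 'H', data, 2)
    if magic != 42:
        raise ValueError('not a classic TIFF file')
    tags = _read_ifd(data, endian)
    width, height = tags[256][0], tags[257][0]   # ImageWidth, ImageLength
    bits = tags.get(258, (1,))[0]                # BitsPerSample
    compression = tags.get(259, (1,))[0]         # 1 means uncompressed
    samples = tags.get(277, (1,))[0]             # SamplesPerPixel
    if compression != 1 or samples != 1 or bits not in (8, 16):
        raise NotImplementedError('only uncompressed 8/16-bit grayscale')
    # Concatenate the strips using StripOffsets (273) and StripByteCounts (279).
    raw = b''.join(data[o:o + n] for o, n in zip(tags[273], tags[279]))
    n_bytes = width * height * bits // 8
    fmt = endian + str(width * height) + ('B' if bits == 8 else 'H')
    values = struct.unpack(fmt, raw[:n_bytes])
    return [list(values[r * width:(r + 1) * width]) for r in range(height)]
```

Usage would be along the lines of `read_tiff_pixels(open(path, 'rb').read())`. The pros and cons above apply directly: there are no dependencies, but every extra TIFF feature (tiles, compression, multiple bands) has to be written and maintained by hand.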

-S