Open ghost opened 3 years ago
Hey @donuty-party, good effort here.
My initial thought is that we may have an easier time handling our internal file structure through the Python library pandas. Pandas may be good for us in general, as it can help with data handling in a number of ways: data cleaning and summarizing, for example. I think the pandas syntax would be good for us all to get some experience with, as it comes up a fair bit in the wider Python data science community.
Perhaps we can add this as a topic for our Tuesday meeting so we can all have a chance to discuss and learn from what you've done and what functionality you're proposing. People may benefit from seeing the pull request procedure in general as well.
I have been foreseeing the need for a Tile object, wherein we can carry some information along with the array, such as data source, extents, resolution, and some other metadata. You've pushed this along well.
@kevinkmcguigan, when you say our internal file structure should be handled through pandas, are you thinking that the directory-files structure should be replaced with a single file for each of Placemarks and TileSources, in something like CSV, to be ingested into a pandas data frame?
Even if this is the wrong direction for the project, writing something functional is helping me get more comfortable with Python, and with GitHub. At work, I didn't have many opportunities to use Python, and I assume I wouldn't have been allowed to use GitHub – I didn't use any revision control at all – so this is good.
Hey Jeff! I'm happy you're putting a fair bit of effort into this - I agree it's a good place to learn things like this! You're pushing forward into some territory I don't typically take advantage of in Python and Git! I don't tend to use TOML config files, for example. That's cool! Sure beats XML, it seems! Learning and experience are good all around.
I know you've been putting a fair effort into this and it's in a holding pattern. I may jump ahead of our Tuesday discussion on it and merge in your changes shortly so that we can continue on with some of the image processing and machine learning parts. I am worried about some of the added complexity, especially when it comes to saving and loading various models, but we can debug that as we go perhaps.
> I am worried about some of the added complexity especially when it comes to saving and loading various models but we can debug that as we go perhaps.
@kevinkmcguigan, I'm less sure myself about the model loading feature, since it makes assumptions I'm not sure will remain valid if we move to Keras or something similar. The code itself isn't as good as the Placemark and TileSource code either. I'm going to take that out and put it into another branch.
We want the placemark and tile source functionality included - we will abandon the model handling script for now.
This would make `core.py` dependent on `placemark.py`, `tilesource.py`, and all of their config files. It would rip out the guts of `getTile()` and `getTile_3x3()`; these would wrap `TileSource.tile()` and `TileSource.tile_3x3()`.
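To make the wrapping idea concrete, here is a minimal sketch. The `TileSource` class below is a hypothetical stand-in, not the code from this pull request: the `(x, y, zoom)` arguments, the string return value, and passing the source in as the first argument are all assumptions made only so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class TileSource:
    """Hypothetical stand-in for the class defined in tilesource.py."""
    name: str

    def tile(self, x, y, zoom):
        # A real implementation would load or download imagery; returning
        # a label keeps the delegation easy to see.
        return f"{self.name}/{zoom}/{x}/{y}"

    def tile_3x3(self, x, y, zoom):
        # Centre tile plus its eight neighbours, one row per dy offset.
        return [[self.tile(x + dx, y + dy, zoom) for dx in (-1, 0, 1)]
                for dy in (-1, 0, 1)]


def getTile(source, x, y, zoom):
    """Thin wrapper so existing callers of getTile() keep working."""
    return source.tile(x, y, zoom)


def getTile_3x3(source, x, y, zoom):
    """Thin wrapper around TileSource.tile_3x3()."""
    return source.tile_3x3(x, y, zoom)
```

With wrappers like these, the old function names stay available while the actual work lives in one place on the `TileSource` object.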
I'm using relative imports, so you can no longer just run the script with `python core.py`; you need to use `python -m Library.core`. If there's a way to avoid this, let me know.
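One common workaround, offered as a sketch rather than a recommendation for this repo, is to try the relative import first and fall back to a plain import when the file is run directly. The demo below builds a throwaway `Library` package in a temporary directory and runs a hypothetical `core.py` both ways; the `NAME` constant and the file contents are invented for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Body of a hypothetical Library/core.py that tolerates both invocations.
CORE_SOURCE = '''\
try:
    from .tilesource import NAME      # works with: python -m Library.core
except ImportError:
    from tilesource import NAME       # works with: python core.py

print(NAME)
'''


def run_both_ways():
    """Create a throwaway package, then run core.py as a module and as a script."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "Library"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "tilesource.py").write_text("NAME = 'tilesource loaded'\n")
        (pkg / "core.py").write_text(CORE_SOURCE)

        # Run from the parent directory so the Library package is importable.
        as_module = subprocess.run(
            [sys.executable, "-m", "Library.core"],
            cwd=root, capture_output=True, text=True)
        # Run from inside the package directory, as a plain script.
        as_script = subprocess.run(
            [sys.executable, "core.py"],
            cwd=pkg, capture_output=True, text=True)
        return as_module.stdout.strip(), as_script.stdout.strip()
```

The trade-off is that the `except ImportError` branch also fires when `tilesource.py` itself fails to import something, which can make the resulting error message harder to read. Sticking with `python -m Library.core` avoids that ambiguity.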
This is major stuff, but the advantages, in my opinion, are: