Closed aazuspan closed 1 year ago
@grovduck I think I fixed this in the refactor branch, but I'm still figuring out how best to set up Docker, so please let me know if you run into any issues like slow builds, missing/out of sync data, etc.
Also, have you noticed that Git changes aren't automatically refreshed in your Docker container, e.g. you make a change to a file and it's not marked as modified in VS Code? I'm running into that, but not sure if it's something in my local config or the Docker settings.
@aazuspan, sorry I've been so slow to review. Going through the notebooks now.
> Also, have you noticed that Git changes aren't automatically refreshed in your Docker container, e.g. you make a change to a file and it's not marked as modified in VS Code? I'm running into that, but not sure if it's something in my local config or the Docker settings.
Yes, I'm experiencing the same behavior. For example, I created the .h5 file from the first notebook, but it didn't automatically refresh when created. When I did hit the refresh button up top, it did show up.
Also, as part of running the notebooks, I create modifications (random sampling, etc.), but those modifications are not being shown in VSCode. I can run git status at command line and it shows as changed, but even a refresh of VSCode doesn't make it show up in the Source Control panel.
> I think I fixed this in the refactor branch, but I'm still figuring out how best to set up Docker, so please let me know if you run into any issues like slow builds, missing/out of sync data, etc.
I'm pretty sure you haven't committed the Malheur_lidar_cancov.tif file (in the data directory) - that's the only file I've had to manually add. But perhaps I'm missing data that I should have? My rebuild of the container was very fast.
> Yes, I'm experiencing the same behavior. For example, I created the .h5 file from the first notebook, but it didn't automatically refresh when created. When I did hit the refresh button up top, it did show up. Also, as part of running the notebooks, I create modifications (random sampling, etc.), but those modifications are not being shown in VSCode. I can run git status at command line and it shows as changed, but even a refresh of VSCode doesn't make it show up in the Source Control panel.
Good to know! I'm also having to manually refresh the Source Control tab to get changes to show up, so it sounds like that's probably a Docker problem. I opened #8 to track that.
> Also, as part of running the notebooks, I create modifications (random sampling, etc.), but those modifications are not being shown in VSCode. I can run git status at command line and it shows as changed, but even a refresh of VSCode doesn't make it show up in the Source Control panel.
That's interesting, I'm not sure what could cause that... Does that only affect notebooks?
> I'm pretty sure you haven't committed the Malheur_lidar_cancov.tif file (in the data directory) - that's the only file I've had to manually add.
Yes, that's on me! I was hesitating to start committing data or models because we probably won't end up being able to store everything on GitHub. Actually, as part of #5 I'm considering migrating all the LiDAR to EE to simplify the footprint sampling, at which point I think all of the data would be either remote or generated.
> > Also, as part of running the notebooks, I create modifications (random sampling, etc.), but those modifications are not being shown in VSCode. I can run git status at command line and it shows as changed, but even a refresh of VSCode doesn't make it show up in the Source Control panel.
>
> That's interesting, I'm not sure what could cause that... Does that only affect notebooks?
Sorry, that wasn't clear at all. I just meant that it didn't initially show as a change in the Source Control tab (as you note), i.e. incrementing the number of files that had changed. The actual text is changing.
When the Docker image is rebuilt, all data in the project directory is included in the "build context". For me, this currently means that the build context is about 31 GB and takes half an hour to rebuild!
We need to make sure all data in the directory is available to the container so that we can train on it, but don't want to have to copy it all into the container like we're currently doing. This obviously isn't a unique problem, so there must be a solution.
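Before changing the setup, it may help to see which top-level folders account for most of the build context. Here is a rough sketch for that, run from the project root. The function name `context_sizes` is made up for illustration, and the script does not read any existing `.dockerignore`, so it reports an upper bound on what Docker would actually send.

```python
import os
from pathlib import Path


def context_sizes(root: str) -> dict[str, int]:
    """Return the total size in bytes of each top-level entry under root.

    Hypothetical helper: it ignores .dockerignore rules, so the totals
    are an upper bound on the real build context.
    """
    sizes: dict[str, int] = {}
    for entry in Path(root).iterdir():
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
            continue
        total = 0
        # Walk the directory tree and add up regular files, skipping symlinks.
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


if __name__ == "__main__":
    # Print the largest entries first.
    ranked = sorted(context_sizes(".").items(), key=lambda kv: -kv[1])
    for name, size in ranked:
        print(f"{size / 1e9:8.2f} GB  {name}")
```

Directories that dominate the total are candidates for a `.dockerignore` entry, which keeps them out of the build context. They would then need to reach the container some other way, for example through a bind mount at run time.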
This StackOverflow response looks like it's probably the answer:
Other references: