SatelliteShorelines / CoastSeg

An interactive toolbox for downloading satellite imagery, applying image segmentation models, mapping shoreline positions and more. The mapping extension for CoastSat and Zoo.
https://satelliteshorelines.github.io/CoastSeg/
GNU General Public License v3.0

Create a guide on how to filter data #146

Closed 2320sharon closed 1 year ago

2320sharon commented 1 year ago

Title: Allow Removal of Files from the RGB Directory in the Preprocessed Folder

Description: Users need to be able to remove files from the data/roi_id/jpg_files/preprocessed/RGB directory to filter out unwanted imagery. Removing these files does not delete any TIFF files; it makes shoreline extraction faster and improves the quality of the extracted shorelines. Users are encouraged to move any problematic imagery to a designated subdirectory named 'bad'.

Steps to implement the file removal feature:

  1. Create a new subdirectory named 'bad' within the data/roi_id/jpg_files/preprocessed directory if it does not already exist.
  2. Identify the files within the data/roi_id/jpg_files/preprocessed/RGB directory that you want to remove.
  3. Copy the files you wish to remove and paste them into the 'bad' subdirectory created in step 1.
  4. Confirm that the files are successfully copied to the 'bad' subdirectory.
  5. Double-check that the correct files are selected for removal before deleting anything.
  6. Once you have verified that the files are safely stored in the 'bad' subdirectory, delete the originals from the data/roi_id/jpg_files/preprocessed/RGB directory.
  7. Test the shoreline extraction process to verify that the removal of files from the data/roi_id/jpg_files/preprocessed/RGB directory has improved the efficiency and quality of the extracted shorelines.

Please note that this feature only allows users to remove files from the specified directory and move them to the 'bad' subdirectory. It does not delete any TIFF files or affect other directories within the project.
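The manual steps above could also be scripted. The following is a minimal sketch, not part of CoastSeg: the function name `move_bad_images` and its parameters are hypothetical, and it assumes the ROI folder follows the jpg_files/preprocessed/RGB layout described in this issue. It moves files rather than copying and then deleting them, so nothing is removed without a copy already sitting in 'bad'.

```python
from pathlib import Path
import shutil


def move_bad_images(roi_dir, bad_names):
    """Move the named JPGs from preprocessed/RGB into preprocessed/bad.

    roi_dir   -- path to the ROI folder (the data/roi_id part of the path above)
    bad_names -- iterable of file names to filter out
    Returns the destination paths of the files that were actually moved.
    Hypothetical helper for illustration; it never touches TIFF files.
    """
    preprocessed = Path(roi_dir) / "jpg_files" / "preprocessed"
    rgb_dir = preprocessed / "RGB"
    bad_dir = preprocessed / "bad"
    bad_dir.mkdir(exist_ok=True)  # step 1: create 'bad' if it is missing

    moved = []
    for name in bad_names:
        src = rgb_dir / name
        if not src.is_file():
            # Skip names that are not in RGB instead of raising
            print(f"skipping {name}: not found in {rgb_dir}")
            continue
        dst = bad_dir / name
        shutil.move(str(src), str(dst))  # steps 3-5 in one operation
        moved.append(dst)
    return moved


# Example call (the path and file name are placeholders):
# move_bad_images("data/roi_id", ["some_cloudy_image.jpg"])
```

Moving a file back from 'bad' to RGB undoes the filtering, which is the main reason to prefer this over deleting images outright.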

2320sharon commented 1 year ago

I've written version one in the wiki

2320sharon commented 1 year ago

I've added more screenshots and extra instructions

venuswku commented 1 year ago

Here's my feedback on the guide so far:

  1. I liked how you included how to find the right ROI directory from a session config file! That's something handy I didn’t know was provided in the configs.
  2. Tiny little nitpick: screenshot of where to find the ROI directory name is from a config file in a data directory. I feel like it would be better if it was a config file from a session directory.
  3. Little typo: replace "the" with "that":
    1. Remove the files that you don't like
  4. It would be helpful to include a link to the wiki page about the shoreline extraction process for step 5.
  5. Running the 4-class RGB model (with the Unet notebook) on my filtered RGB images went well, but when I tried extracting shorelines I got this error:
    
    Extracting shorelines. Please wait.
    Retrieving: https://zenodo.org/record/7814755/files/global_shoreline_5deg_1022.geojson?download=1
    Retrieving file: C:\Users\Venuxk\anaconda3\envs\coastseg2\lib\site-packages\coastseg\shorelines\global_shoreline_5deg_1022.geojson
    Downloading global_shoreline_5deg_1022.geojson: 100%
    19.2M/19.2M [00:09<00:00, 1.73MB/s]
    [                                        ] | 0% Completed | 1.51 ms
    Mapping Shorelines for L8: 100%
    91/91 [00:00<00:00, 550.00it/s]
    [########################################] | 100% Completed | 3.26 ss
    extract_shorelines_with_dask took 3.278663 seconds to run.
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    File ~\anaconda3\envs\coastseg2\lib\site-packages\ipywidgets\widgets\widget_output.py:103, in Output.capture.<locals>.capture_decorator.<locals>.inner(*args, **kwargs)
    101     self.clear_output(*clear_args, **clear_kwargs)
    102 with self:
    --> 103     return func(*args, **kwargs)

    File ~\anaconda3\envs\coastseg2\lib\site-packages\coastseg\models_UI.py:572, in UI_Models.extract_shorelines_button_clicked(self, button)
    570 zoo_model_instance = self.get_model_instance()
    571 # load in shoreline settings, session directory with model outputs, and a new session name to store extracted shorelines
    --> 572 zoo_model_instance.extract_shorelines_with_unet(
    573     shoreline_settings,
    574     session_directory,
    575     session_name,
    576     shoreline_path,
    577     transects_path,
    578 )

    File ~\anaconda3\envs\coastseg2\lib\site-packages\coastseg\zoo_model.py:850, in Zoo_Model.extract_shorelines_with_unet(self, extract_shoreline_settings, session_path, session_name, shoreline_path, transects_path)
    847 # extract shorelines
    848 extracted_shorelines = extracted_shoreline.Extracted_Shoreline()
    849 extracted_shorelines = (
    --> 850     extracted_shorelines.create_extracted_shorlines_from_session(
    851         roi_id,
    852         shoreline_gdf,
    853         roi_settings,
    854         extract_shoreline_settings,
    855         session_path,
    856         new_session_path,
    857     )
    858 )
    860 # save extracted shorelines, detection jpgs, configs, model settings files to the session directory
    861 common.save_extracted_shorelines(extracted_shorelines, new_session_path)

    File ~\anaconda3\envs\coastseg2\lib\site-packages\coastseg\extracted_shoreline.py:1347, in Extracted_Shoreline.create_extracted_shorlines_from_session(self, roi_id, shoreline, roi_settings, settings, session_path, new_session_path)
    1338 extracted_shorelines_dict = extract_shorelines_with_dask(
    1339     session_path,
    1340     metadata,
    (...)
    1344     save_location=new_session_path,
    1345 )
    1346 if extracted_shorelines_dict == {}:
    -> 1347     raise Exception(f"Failed to extract any shorelines.")
    1349 logger.info(f"extracted_shoreline_dict: {extracted_shorelines_dict}")
    1350 # postprocessing by removing duplicates and removing in inaccurate georeferencing (set threshold to 10 m)

    Exception: Failed to extract any shorelines.


Here's my [log file](https://github.com/Doodleverse/CoastSeg/files/11670863/log_06-06-23-03_35_14.txt) and [ROI directory](https://github.com/Doodleverse/CoastSeg/files/11670873/ID_4_datetime05-23-23__02_47_25.zip).
2320sharon commented 1 year ago

Thank you for the feedback Venus. That's strange that even with 91 images it couldn't extract any shorelines. Let me take a look at the log and see if I can figure out what's going wrong.

Also I appreciate you writing everything out so nicely ✨️

2320sharon commented 1 year ago

Looking at the log, a common error message is that there wasn't enough sand in the beach buffer. Do you mind sharing the model session directory?

venuswku commented 1 year ago

Thanks for helping me out again with my errors! Here's my model session output: filtered_out_bad_imgs_SEGMENTATION.zip.

2320sharon commented 1 year ago

Well, I think the problem you encountered was due to bad segmentations. We do plan on improving the error codes so users know why shoreline extraction fails in general. Thank you for testing this and for your feedback!

2320sharon commented 1 year ago

@venuswku Do you think it would be helpful to have the error message say something like: "85% failed due to not enough sand in reference shoreline buffer and 15% failed due to too many clouds in imagery"? My reasoning is that each of the shorelines could fail to extract for a different reason, but we could summarize why groups of shorelines failed to extract so the user knows what's going on.

venuswku commented 1 year ago

Ok, thanks for explaining what might have caused this error. Overall the guide is super clear and informative! I'll try again with more images or a different location to see if the extracted shorelines are better.

venuswku commented 1 year ago

Yeah that error message looks good. It has more context about what's happening so it's easier to understand.

2320sharon commented 1 year ago

Thank you! I think I'll sketch out an idea for how to implement this type of error messaging system. @dbuscombe-usgs what do you think of implementing error messages like these for extract shorelines?

> @venuswku Do you think it would be helpful to have the error message say something like: "85% failed due to not enough sand in reference shoreline buffer and 15% failed due to too many clouds in imagery"? My reasoning is that each of the shorelines could fail to extract for a different reason, but we could summarize why groups of shorelines failed to extract so the user knows what's going on.

Depending on the most common error message received, we could provide some recommendations on how to solve the problem. For instance, with this one we could recommend that the user check their segmentations and then increase the reference shoreline buffer size.
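
A rough sketch of how such a summary could be built, assuming one failure-reason string is collected per image that failed. None of this is CoastSeg code: `summarize_failures` and `RECOMMENDATIONS` are hypothetical names, and the reason strings are just the ones from the example message above.

```python
from collections import Counter

# Hypothetical lookup from failure reason to a suggested fix.
# The single entry mirrors the recommendation discussed in this thread.
RECOMMENDATIONS = {
    "not enough sand in reference shoreline buffer":
        "check the segmentations, then increase the reference shoreline buffer size",
}


def summarize_failures(reasons):
    """Summarize why shorelines failed to extract.

    reasons -- list with one failure-reason string per failed image
    Returns a single message giving the percentage per reason, most common
    first, plus a suggestion for the most common reason if one is known.
    """
    if not reasons:
        return "No failures recorded."
    counts = Counter(reasons)
    total = len(reasons)
    parts = [
        f"{100 * n / total:.0f}% failed due to {reason}"
        for reason, n in counts.most_common()
    ]
    summary = " and ".join(parts)
    top_reason = counts.most_common(1)[0][0]
    tip = RECOMMENDATIONS.get(top_reason)
    if tip:
        summary += f". Suggestion: {tip}"
    return summary
```

Grouping by reason keeps the message short even when dozens of images fail, and tying the suggestion to the most common reason means the user sees the fix most likely to help first.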

2320sharon commented 1 year ago

I implemented the feedback thanks Venus! https://github.com/Doodleverse/CoastSeg/wiki/5.-How-to-Filter-Out-Bad-Imagery