Isaiahensley / Aquatic-RIG

Our Senior Capstone project focuses on developing a Streamlit website dedicated to the visualization of aquatic NetCDF datasets. Aquatic data is inherently complex and spatiotemporal, capturing information across both time and space. Our website lets users step through time and depth to build comprehensive visuals of their data.
https://aquaticrig-develop.streamlit.app/

Adding preloaded .nc files for feedback testing #79

Closed Isaiahensley closed 5 months ago

Isaiahensley commented 6 months ago

Description: Currently, users must upload .nc files to use the main functionality of our website. While this makes sense, we would like to offer preexisting files already present on the website as an option for testing and to show an example of how our website works. This will not only help users get an idea of how our website manipulates these files but also give them a way to provide feedback.

Expected Outcome: When first loading the dataset management page, the user will have two options (these names may be changed later on):

1) Upload NetCDF4 Files
2) Use Example Files

If "Use Example Files" is selected, the website will simply use the files already stored on the website and skip the upload files screen.

Additional context: Relevant functionality is being worked on in the Dataset Management Page UI issue. It may be beneficial to complete that issue before starting on this one.

Isaiahensley commented 6 months ago

Description:

Did not make much progress, but I messed around with different widgets and layouts for our example dataset option. I think we should use a checkbox: if it is selected, it should ignore any files that have been uploaded and allow the user to press Next. It will use files in a folder named example_dataset, which can be placed in our GitHub repository so Streamlit sharing can use them.


Screenshot Snippets:

[screenshot]

Isaiahensley commented 6 months ago

Description:

Preloaded .nc files have been added. I've added an example_dataset folder with 5 .nc files users can use if they don't wish to upload their own. If a user either uploads files or checks the "Use Example Dataset" checkbox, they can press the Next button and continue to the next page. However, if files are uploaded and the checkbox is also checked, the uploaded files are ignored. I had to make SEVERAL changes throughout the code to make this work, since uploaded files are treated a lot differently than files stored locally. I've also gone through and added comments to help explain the code.


Screenshot Snippets:

[screenshot]


Code Snippets:

Loads the 5 NetCDF4 files from the example_dataset folder and wraps them in BytesIO objects so they behave like uploaded files

import os
from io import BytesIO

def load_example_dataset():
    example_files = []
    example_dataset_folder = 'example_dataset'
    for filename in os.listdir(example_dataset_folder):
        if filename.endswith('.nc'):
            file_path = os.path.join(example_dataset_folder, filename)
            with open(file_path, 'rb') as f:
                bytes_io = BytesIO(f.read())
                bytes_io.name = filename  # Set the name attribute to the filename
                example_files.append(bytes_io)
    return example_files
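
For reference, a quick way this could be exercised outside of Streamlit (hypothetical usage, assuming the example_dataset folder sits next to the script):

example_files = load_example_dataset()
for f in example_files:
    # Each entry is an in-memory BytesIO with a .name attribute, mimicking an uploaded file
    print(f.name, len(f.getbuffer()), "bytes")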

Changed extract_file_data() to accept files that were not uploaded through the file uploader (such as the example dataset files) and treat them appropriately

from collections import defaultdict
import netCDF4 as nc

def extract_file_data(files_upload):
    # Initialize defaults
    datetime_to_file_map = defaultdict(list)
    all_datetime_strings = []
    variables_not_dimensions = []
    depth_levels = None

    # Extracts info from the provided .nc files (uploaded or example dataset)
    if files_upload:
        all_datetime_strings = []
        variables_not_dimensions = []

        for uploaded_file in files_upload:
            nc_file = nc.Dataset('in-memory', memory=uploaded_file.getvalue(), diskless=True)

            # Record the number of depth levels if the file has a depth dimension
            if 'depth' in nc_file.dimensions:
                depth_dim = nc_file.dimensions['depth']
                depth_levels = len(depth_dim)  # Store the number of depth levels

            time_var = nc_file.variables['time']
            time_units = time_var.units
            datetimes = nc.num2date(time_var[:], units=time_units)

            for dt in datetimes:
                dt_str = str(dt)
                all_datetime_strings.append(dt_str)
                datetime_to_file_map[dt_str].append(uploaded_file.name)

            # Identify variables that are not dimensions
            if not variables_not_dimensions:  # If not already determined
                for var_name, variable in nc_file.variables.items():
                    if var_name not in nc_file.dimensions:
                        variables_not_dimensions.append(var_name)

            nc_file.close()
    return all_datetime_strings, depth_levels, variables_not_dimensions, datetime_to_file_map
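
A hypothetical way to chain the two functions together outside of Streamlit, purely to illustrate what extract_file_data() returns:

files = load_example_dataset()
(all_datetime_strings,
 depth_levels,
 variables_not_dimensions,
 datetime_to_file_map) = extract_file_data(files)

print(f"{len(all_datetime_strings)} timestamps across {len(files)} files")
print(f"Depth levels: {depth_levels}")
print(f"Plottable variables: {variables_not_dimensions}")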

Changed the initial page to let the user pick between uploading files and using the example dataset. Also created session state entries for the file variables so they carry over to each page for further computation.

import streamlit as st

def dataset_management_page():
    initialize_state()

    # ------------------------
    # Step 0: File Upload Page
    # ------------------------

    if st.session_state['current_step'] == 0:
        st.session_state['files_upload'] = file_uploader()
        st.write("")
        st.session_state['example_dataset'] = st.checkbox("Use Example Dataset")
        st.write("")
        next_button(st.session_state['files_upload'] or st.session_state['example_dataset'])
        # Proceed after next button is pressed if files are uploaded or example dataset is checked

    # -----------------------------------
    # Step 1: Visualization Selection Page
    # -----------------------------------

    if st.session_state['current_step'] == 1:

        # Decide which files to use: uploaded files or example dataset files
        # If files are uploaded and the example dataset is also checked, the uploaded files are ignored
        files_to_process = None
        if st.session_state['example_dataset']:
            # Load example dataset files
            files_to_process = load_example_dataset()
        elif st.session_state['files_upload']:
            # Use the files uploaded by the user
            files_to_process = st.session_state['files_upload']

        # Extract data from either uploaded or example dataset files
        if files_to_process:
            (st.session_state['all_datetime_strings'],
             st.session_state['depth_levels'],
             st.session_state['variables_not_dimensions'],
             st.session_state['datetime_to_file_map']) = extract_file_data(files_to_process)
             ...
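
The helpers initialize_state(), file_uploader(), and next_button() are referenced above but not shown in this snippet. A minimal sketch of how initialize_state() and next_button() might look (an assumption for illustration, not the actual implementation, with streamlit imported as st as in the snippet above):

def initialize_state():
    # Ensure the keys used across pages exist in session state before they are read.
    defaults = {
        'current_step': 0,
        'files_upload': None,
        'example_dataset': False,
    }
    for key, value in defaults.items():
        if key not in st.session_state:
            st.session_state[key] = value

def next_button(enabled):
    # Advance to the next step only when files were uploaded
    # or the "Use Example Dataset" checkbox is checked.
    if st.button("Next", disabled=not enabled):
        st.session_state['current_step'] += 1
        st.rerun()
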
spaude11 commented 5 months ago

Are we not done with this issue?

Isaiahensley commented 5 months ago

@spaude11 Yes, we're done. I pushed the code to my new feature branch; we can review it and move it over to the develop branch soon.

spaude11 commented 5 months ago

I will close the issue.