Imageomics / dashboard-prototype

Prototype data dashboard for Imageomics Data
http://dash.imageomics.org
MIT License

Bug in the dropdown selector #63

Closed: egrace479 closed this issue 4 months ago

egrace479 commented 4 months ago

I also noticed a bug in the dropdown selector: if you go from specifying a subspecies to selecting Any-<species> for the subspecies, it will not recognize the Any, because at this stage it is returned as a list (['Any']) instead of a string ('Any'), which is the form of the initial return. This is a result of the type check that was inserted (at line 142 in the updated components/query.py) to handle cases where people do not remove the "Any" condition when selecting more subspecies (assuming they'd want the specified options). I intend to fix this in another PR to dev for further testing before updating main and generating a new release.

Originally posted by @egrace479 in https://github.com/Imageomics/dashboard-prototype/issues/62#issuecomment-2077725325

egrace479 commented 4 months ago

Current set-up: [Screenshot 2024-04-26 at 1 15 18 PM]

I had been trying to reduce the number of if-statements, as there are many cases (the first return, before anything is selected in subspecies, is a string, but every return after a selection has been made is a list):

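# "Any" options arrive as a string on the first return and as a one-item list afterwards
is_any_str = isinstance(subspecies, str) and "Any" in subspecies
is_any_list = isinstance(subspecies, list) and len(subspecies) == 1 and "Any" in subspecies[0]
if is_any_str or is_any_list:
    if isinstance(subspecies, list):
        subspecies = subspecies[0]  # unwrap so the string comparisons below work
    if subspecies == 'Any':
        df_sub = df.copy()
    else:
        species = subspecies.split('-', 1)[1] # will be case that's input
        df_sub = df.loc[df.Species == species].copy()
else:
    df_sub = df.loc[df.Subspecies.isin(subspecies)].copy()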

Since the only options that return as a string are "Any" and "Any-<species>", this should catch a one-item list and check it for "Any", while preserving list values for the .isin in the else branch.
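
The string-versus-list handling described above can be sketched as a standalone function. This is only an illustration under assumptions: the name `filter_by_subspecies` and the sample data are hypothetical and not taken from components/query.py; only the `Species` and `Subspecies` column names come from the snippet above.

```python
import pandas as pd


def filter_by_subspecies(df, subspecies):
    """Hypothetical helper: subset df given a dropdown value (str or list)."""
    # Unwrap a one-item list holding an "Any" option so it matches the
    # string form returned before any selection has been made.
    if (isinstance(subspecies, list) and len(subspecies) == 1
            and subspecies[0].startswith("Any")):
        subspecies = subspecies[0]
    if isinstance(subspecies, str):
        # Assumes the only string values are "Any" and "Any-<species>"
        if subspecies == "Any":
            return df.copy()
        species = subspecies.split("-", 1)[1]  # text after the first hyphen
        return df.loc[df.Species == species].copy()
    # Otherwise a list of specific subspecies
    return df.loc[df.Subspecies.isin(subspecies)].copy()


# Made-up sample data, for illustration only
df = pd.DataFrame({
    "Species": ["melpomene", "melpomene", "erato"],
    "Subspecies": ["rosina", "plesseni", "petiverana"],
})

any_from_list = filter_by_subspecies(df, ["Any"])          # same result as "Any"
erato_only = filter_by_subspecies(df, ["Any-erato"])       # all rows for erato
picked = filter_by_subspecies(df, ["rosina", "plesseni"])  # specific subspecies
```

Normalizing the value once up front keeps the later branches free of repeated type checks, which was the stated goal of reducing if-statements.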

egrace479 commented 4 months ago

We also had instances where the subspecies value was null (filled with unknown) for multiple species. This resulted in images for different species being displayed based on that unknown selection.

Two potential options to address this:

  1. Call get_species_options prior to filling null values and drop them as options. The images could still be accessed when looking at any options of the selected species (or any sample images across any species).
  2. Add a check for null subspecies values into get_species_options and add something like an "unknown ssp. of <species>" option to the subspecies options list:
    for species in species_list:
        temp = df.loc[df.Species == species]
        # convert to a list so that .append and .insert are available
        subspecies_list = list(temp['Subspecies'].dropna().unique())
        if temp['Subspecies'].isna().any():
            # if there are null values of Subspecies for given species, record as "unknown <species>" option
            subspecies_list.append("unknown " + species)
        subspecies_list.insert(0, 'Any-' + species) # need this to match as filled for img selection
        all_species[species] = subspecies_list

    In the get_filenames function, there would then be a restriction to just that species and a search for unknown among the subspecies.
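
The restriction described for Option 2 might look roughly like the sketch below. It assumes nulls have already been filled with "unknown" and that the option label is "unknown " followed by the species name, as in the loop above. The helper name `select_unknown_subspecies`, the `Image_filename` column, and the sample data are hypothetical; this is not the actual get_filenames implementation.

```python
import pandas as pd

UNKNOWN_PREFIX = "unknown "  # assumed label prefix, matching the option built above


def select_unknown_subspecies(df, option):
    """Hypothetical helper: rows of one species whose Subspecies is 'unknown'."""
    species = option[len(UNKNOWN_PREFIX):]
    # Restrict to the species first, so "unknown" rows of other species are excluded
    mask = (df["Species"] == species) & (df["Subspecies"] == "unknown")
    return df.loc[mask].copy()


# Made-up sample data with nulls filled as "unknown"
df = pd.DataFrame({
    "Species": ["melpomene", "erato", "erato"],
    "Subspecies": [None, None, "petiverana"],
    "Image_filename": ["a.jpg", "b.jpg", "c.jpg"],
}).fillna({"Subspecies": "unknown"})

erato_unknown = select_unknown_subspecies(df, "unknown erato")
```

Filtering on species and subspecies together is what prevents the cross-species mix-up described at the top of this comment.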

Option 2 feels a bit too far into the weeds, considering the desire to generalize this. I was also rethinking the wisdom of using unknown to fill nulls (instead of, e.g., not-provided), as unknown implies a lack of knowledge (and sometimes appears), as opposed to simply the information not being available.
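
For comparison, Option 1 could be sketched as follows: build the dropdown options before filling nulls, so that null subspecies never become selectable. The function name `get_subspecies_options` and the sample data are hypothetical stand-ins, not the real get_species_options.

```python
import pandas as pd


def get_subspecies_options(df):
    """Hypothetical sketch of Option 1: build options before nulls are filled."""
    options = {}
    for species in df["Species"].dropna().unique():
        # dropna() keeps null subspecies out of the dropdown options
        subspecies = df.loc[df["Species"] == species, "Subspecies"].dropna().unique()
        options[species] = ["Any-" + species] + list(subspecies)
    return options


# Made-up sample data, for illustration only
df = pd.DataFrame({
    "Species": ["melpomene", "erato", "erato"],
    "Subspecies": [None, None, "petiverana"],
})

all_species = get_subspecies_options(df)   # options built first
df = df.fillna({"Subspecies": "unknown"})  # nulls filled afterwards
```

Rows with a filled subspecies remain reachable through the "Any-<species>" option, but "unknown" itself is never offered as a choice.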

egrace479 commented 4 months ago

Resolved in PR #65