VEuPathDB / EdaSubsettingService

A REST service to provide data and subsetting in the Exploratory Data Analysis Workspace
Apache License 2.0

Subsetting results table: Remove preview in download modal if none available? #78

Closed asizemore closed 1 year ago

asizemore commented 2 years ago

From slack discussion in microbiome channel

(Jay) Is this a known issue? The download preview is missing for most studies. I think the cause is the missing "wide" table for assays. (Go to Download, click through the error popup, select something from the Assay variables: no preview.) Note this does not affect the download file; you get everything selected.

The wide table is created per study per entity (EDA.Attributes[studyName][entityType]) and has one column for each variable. In Oracle there is a hard limit of 1000 columns, and it is not possible to increase this limit. We cannot generate the wide table for assays because most of them would have over 1000 columns. There are 3 studies that have fewer than 1000 assay variables: Uganda Maternal, Resistome, St Louis NICU. For these you will get a preview.

(Dan) ... I think it's OK; most users will just want to download and will be OK with not previewing. If there were an easy fix, I'd say go for it, but that doesn't sound like the case.

(Jay) I guess UI developers will have to either remove the preview or get the data from another table, EDA.AttributeValue[studyName][entityType], which is probably what is used to make the download file.

Would a reasonable solution for the upcoming release be to just show "No Preview Available" for the assays?

dmfalke commented 2 years ago

We can add some code in the front end to disable the button if the entity has > 1000 variables. Would that be a reasonable approach, or is there something more nuanced going on? I am guessing this would be a short term fix.
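A minimal sketch of the check described above, written in Python only for illustration (the actual frontend is not Python). The function name and the constant are hypothetical; the 1000 threshold is the Oracle column limit quoted earlier in this thread.

```python
# Hypothetical helper: decide whether the preview button should be enabled.
# The threshold is the Oracle column limit cited in this thread.
ORACLE_MAX_COLUMNS = 1000


def preview_available(variable_count: int, limit: int = ORACLE_MAX_COLUMNS) -> bool:
    """Return True when the entity has few enough variables for a wide table."""
    # "Disable the button if the entity has > 1000 variables"
    return variable_count <= limit
```

With this rule, an entity with exactly 1000 variables keeps its preview, and one with 1001 does not.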

asizemore commented 2 years ago

@jaycolin is the 1000 column limit something we can expand in the future? I'm curious how short-term a fix the frontend strategy would be.

ryanrdoherty commented 2 years ago

For the "wide table", I don't think we actually read real columns for those vars; we read them out of the JSON clob. So maybe we could just read the JSON clob in its entirety and get the vars we want out of there. It would be expensive (runtime) because we'd be pulling over ALL the vars to pick out some, but if "some" is >1000, maybe it's worth it?

This is if I'm interpreting correctly that the 1000 column limit is being imposed on the "select", not on any real DB table itself.
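To make the JSON clob idea concrete, here is a rough sketch in Python (illustrative only, not the service's code). It assumes each row's clob is a JSON object keyed by variable ID; that layout, the function name, and the parameter names are all assumptions.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional


def pick_variables(json_clobs: Iterable[str], wanted: List[str]) -> Iterator[Dict[str, Optional[object]]]:
    """Parse each row's full JSON clob, then keep only the requested variables.

    Assumption: each clob is a JSON object mapping variable ID -> value.
    """
    for clob in json_clobs:
        record = json.loads(clob)  # expensive part: every variable is parsed
        # Variables absent from a row come back as None.
        yield {var: record.get(var) for var in wanted}
```

The cost is in `json.loads`: every variable is transferred and parsed for each row, even when only a handful are kept.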

dmfalke commented 2 years ago

@ryanrdoherty I think the error is coming from the tabular endpoint in EdaSubsettingService, fwiw.

asizemore commented 2 years ago

So do I understand correctly that the options are currently:

  1. Read all the vars to pick a few. Expensive but would show the table as expected.
  2. Add frontend check on number of variables. If it's >1000, disable the table. Fast but no table.

Is that about right?

danicahelb commented 1 year ago

From the ClinEpi perspective, we would definitely want option 1 (Read all the vars to pick a few. Expensive but would show the table as expected). It would be fine to disable the ability to add >1000 variables to the table (if the user wanted to download this many variables, they would be better served by downloading the full dataset). But I can definitely see use cases where an entity has >1000 variables in it and the user is only interested in downloading data from 10-100 of the terms.

dmfalke commented 1 year ago

I moved this to EdaSubsettingService, since it sounds like a solution there is preferred. We can move it to web-eda, if the frontend solution becomes the preferred solution.

danicahelb commented 1 year ago

related tickets: https://github.com/VEuPathDB/EdaDataService/issues/226 https://github.com/VEuPathDB/EdaSubsettingService/issues/81

ryanrdoherty commented 1 year ago

Here's a nice description of the issue. When we download, the select of >1000 rows is ok but when we add pagination, Oracle creates a view of the select for sorting and rownum production and that's when we hit our limit. https://stackoverflow.com/questions/13314868/oracle-limit-and-1000-column-restriction

danicahelb commented 1 year ago

Oh! So the issue is with the number of rows (i.e., observations) and NOT the number of columns (i.e., variables). Thanks!

but... @ryanrdoherty the large number of rows isn't an issue for clinepi. This results table has >60K rows!

danicahelb commented 1 year ago

Never mind, I see from the Stack Overflow post that the issue is actually with the columns.

I still don't understand though... if I am only selecting 5 columns to view, why doesn't the table render?

ryanrdoherty commented 1 year ago

The solution we decided on (throw 400 in these limited cases in the subsetting service, client UX TBD), is implemented here: https://github.com/VEuPathDB/EdaSubsettingService/commit/7dd748a65c7f118453f012fc043cf605a3e64630
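For readers who don't want to open the commit, the general shape of "throw 400 in these limited cases" might look like the Python sketch below. This is not the code from the commit. The exception class, function name, and the exact condition (a paginated request over more than 1000 columns) are assumptions based only on the discussion in this thread.

```python
# Hypothetical sketch; the real check lives in the linked commit.
ORACLE_MAX_COLUMNS = 1000  # limit cited in this thread


class BadRequestError(Exception):
    """Stand-in for whatever the service uses to produce an HTTP 400."""

    status_code = 400


def validate_tabular_request(column_count: int, paginated: bool,
                             limit: int = ORACLE_MAX_COLUMNS) -> None:
    """Reject the one combination assumed to trigger the Oracle limit."""
    # Assumption: only paginated requests hit the limit, because pagination
    # wraps the select in a view; plain downloads are reported to work.
    if paginated and column_count > limit:
        raise BadRequestError(
            f"Cannot paginate a result with {column_count} columns (limit {limit})."
        )
```

A client can then catch the 400 and show a friendlier message instead of a generic server error.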

ryanrdoherty commented 1 year ago

Leaving the rest of this implementation to the client.

asizemore commented 1 year ago

Thanks @ryanrdoherty !

danicahelb commented 1 year ago

@ryanrdoherty @asizemore I still see the same server-error message. Do I need to make a frontend ticket to put up a nice warning message instead of this error?

asizemore commented 1 year ago

I think a ticket already exists. The PR for the frontend is here

danicahelb commented 1 year ago

the ticket for the front end message is here: https://github.com/VEuPathDB/web-eda/issues/1652

danicahelb commented 1 year ago

this is now fixed
