VEuPathDB / EdaSubsettingService

A REST service to provide data and subsetting in the Exploratory Data Analysis Workspace
Apache License 2.0

Remove time from date variables with no times in subset modal download files #56

Closed SheenaTomko closed 2 years ago

SheenaTomko commented 2 years ago

In the download files produced by the subset modal, users can download date variables. Usually (maybe always) these dates have no time associated with them, yet each value appears as 2017-03-14T00:00:00 (e.g., the LLINEUP "Observation date" variable). When no times were entered, we want to strip "T00:00:00" from the values.
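A minimal sketch of the per-value trim requested here, in Python for illustration only (it is not the service's actual code). The function name `trim_midnight` is hypothetical, and the sketch assumes values arrive as strings in the `YYYY-MM-DDTHH:MM:SS` form shown above.

```python
MIDNIGHT_SUFFIX = "T00:00:00"


def trim_midnight(value: str) -> str:
    """Return only the date part when the time component is exactly midnight."""
    if value.endswith(MIDNIGHT_SUFFIX):
        return value[: -len(MIDNIGHT_SUFFIX)]
    # Any other time carries real information, so the value is left untouched.
    return value


print(trim_midnight("2017-03-14T00:00:00"))  # 2017-03-14
print(trim_midnight("2022-01-01T12:00:00"))  # 2022-01-01T12:00:00
```

Applied value by value, this can leave a single column with mixed formats, which is the concern raised in the conversation below.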

ryanrdoherty commented 2 years ago

Conversation with Danica regarding implementation:

Ryan Doherty 4:50 PM Are these the subset downloads (not raw files)?

Danica Helb 4:50 PM yes, this ticket is specific for subset files. i don’t know if it is also an issue with the bulk (raw) files

Ryan Doherty 4:50 PM How can we determine whether to trim the time from a particular variable?

Danica Helb 4:51 PM if the time == 00:00:00 then you should trim (which is the vast majority of cases) 4:51 is that enough for you to implement this?

Ryan Doherty 4:56 PM Across all variables? 4:57 I think so, yes.

Danica Helb 4:57 PM yeah, if all values for any given variable have 00:00:00 then it isn’t providing any info and should be stripped out

Ryan Doherty 4:58 PM So, if a var has 2022-01-01T12:00:00 for one row, but 2022-01-01T00:00:00 for another, you want to trim from the second row but not the first, even if same var on same entity? 4:58 I'm asking because it seems to me that we would want a consistent format for a single column of data.

Danica Helb 4:58 PM in that case, I would leave both. 4:58 yes you are correct, we want to treat all values for a given variable consistently

Ryan Doherty 4:59 PM Right, but we can't read the entire dataset to check if all the values for a given var have 00:00:00 before we decide to trim them off. 4:59 I mean, we can, but it's a pretty big performance hit, double the time really. 4:59 Also a lot more work.
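To make the cost concrete, here is a rough Python sketch of the two-pass approach being weighed: one full scan to decide which date columns contain only midnight times, then a second scan to write the trimmed rows. The function names and the sample column names (`obs_date`, `sample_time`) are hypothetical, and the rows are held in memory as dictionaries purely for illustration; a streamed result would have to be read twice.

```python
MIDNIGHT_SUFFIX = "T00:00:00"


def columns_safe_to_trim(rows, date_columns):
    """First pass: keep the date columns whose every non-empty value is at midnight."""
    safe = set(date_columns)
    for row in rows:
        for col in list(safe):
            value = row[col]
            if value and not value.endswith(MIDNIGHT_SUFFIX):
                safe.discard(col)  # one real time disqualifies the whole column
    return safe


def trim_columns(rows, safe):
    """Second pass: strip the midnight suffix only from columns found safe."""
    for row in rows:
        yield {
            col: (val[: -len(MIDNIGHT_SUFFIX)] if col in safe and val else val)
            for col, val in row.items()
        }


rows = [
    {"id": "1", "obs_date": "2017-03-14T00:00:00", "sample_time": "2022-01-01T12:00:00"},
    {"id": "2", "obs_date": "2017-03-15T00:00:00", "sample_time": "2022-01-01T00:00:00"},
]
safe = columns_safe_to_trim(rows, ["obs_date", "sample_time"])
trimmed = list(trim_columns(rows, safe))
print(safe)        # {'obs_date'}
print(trimmed[1])  # sample_time keeps its T00:00:00 because the column has a real time
```

Every column stays internally consistent, but nothing can be written until the first scan has finished.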

Danica Helb 4:59 PM ah, i see

Ryan Doherty 5:00 PM Would be better to have a flag on the var like "timeOnly" or similar

Danica Helb 5:00 PM ok, let me check in with the outreach team. i think only a minority of variables will have time associated with them

Ryan Doherty 5:00 PM Adding that means a reload of all studies + code work in the subsetting service to read the new metadata

Danica Helb 5:01 PM yeah if we go that direction we won’t do it for b56 :blush:
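For comparison, a sketch of the flag-based alternative discussed above, again in Python and under stated assumptions: the `Variable` class and its `has_time` field are hypothetical stand-ins for whatever per-variable metadata flag would be added at load time, not an existing part of the service. With such a flag, each row can be formatted in a single streaming pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    is_date: bool
    has_time: bool  # hypothetical metadata flag, set when the study is loaded


def format_row(row, variables):
    """Format one row in a single pass, using only per-variable metadata."""
    out = []
    for var, value in zip(variables, row):
        if var.is_date and not var.has_time and value:
            value = value.split("T", 1)[0]  # keep only the date part
        out.append(value)
    return out


variables = [
    Variable("id", is_date=False, has_time=False),
    Variable("obs_date", is_date=True, has_time=False),
    Variable("sample_time", is_date=True, has_time=True),
]
print(format_row(["1", "2017-03-14T00:00:00", "2022-01-01T00:00:00"], variables))
# ['1', '2017-03-14', '2022-01-01T00:00:00']
```

The trade-off is the one noted in the conversation: the decision moves from download time to load time, which requires reloading the studies so the flag is populated.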

nkittur-uga commented 2 years ago

Yup, times have been stripped. QA is complete.