mc2-center / csbc-pson-dcc

Data coordination resources for the NCI CSBC and PS-ON consortia

"|"-separated values not being parsed correctly by portal (in grantName) #73

Closed bswhite closed 4 years ago

bswhite commented 4 years ago

This dataset (alias = PRJNA352075) https://staging.csbc-pson.synapse.org/Explore/Datasets?QueryWrapper0=%7B%22sql%22%3A%22SELECT%20*%5Cn%20%20FROM%20syn21897968%5Cn%20%20WHERE%20(%60datasetAlias%60%20LIKE%20%27%25PRJNA352075%25%27)%22%2C%22limit%22%3A25%2C%22offset%22%3A0%7D has two grants associated with it, separated by a "|".

Columbia University Center for Topology of Cancer Evolution and Heterogeneity | Dana-Farber Cancer Institute Physical Sciences-Oncology Center

The portal does not recognize this as two grants. You can see this by mousing over the grant name, or by clicking on it, which hangs while trying to bring up a grant details page.

jaeddy commented 4 years ago

@bswhite do you know if there's another symbol that the portals recognize as a delimiter? Comma isn't an option, as there are multiple grants that have a comma in the name (note: the opposite problem is true for publications, where titles are being split for a single pub).

Another option would be to convert to an actual STRING_LIST column, but I'm a little worried that might blow up Synapse with such long names. I can try it out on a copy of the table this afternoon.
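For reference, here is a minimal sketch of the delimiter trade-off described above. It is plain Python, not portal code. The `split_field` helper and the comma-containing grant name are hypothetical and only there for illustration. The first grant string is the one from the dataset in this issue.

```python
def split_field(value: str, delimiter: str) -> list[str]:
    """Split a delimited cell into trimmed, non-empty items."""
    return [part.strip() for part in value.split(delimiter) if part.strip()]


# Value taken from the dataset reported in this issue.
grant_name = (
    "Columbia University Center for Topology of Cancer Evolution and Heterogeneity"
    " | Dana-Farber Cancer Institute Physical Sciences-Oncology Center"
)

# Hypothetical grant name containing a comma (illustration only).
comma_name = "Example Center for Imaging, Modeling and Analysis"

# Splitting on "|" yields the two grants.
pipe_items = split_field(grant_name, "|")

# Splitting on "," leaves the two grants fused into one item...
comma_items = split_field(grant_name, ",")

# ...and wrongly breaks a single name that happens to contain a comma.
broken = split_field(comma_name, ",")

print(pipe_items)
print(comma_items)
print(broken)
```

If the portal really does split only on commas, both failure modes would show up: multi-grant cells treated as one grant, and single names with commas treated as several.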

jaeddy commented 4 years ago

Whoops. Didn't mean to close!

bswhite commented 4 years ago

@jaeddy -- I don't know. I'm guessing that the portals only recognize "," as a delimiter. But we could probably change that with Michael. I don't understand how STRING_LIST is implemented, so I can't comment on the "blowing up."

"|" does seem like a very reasonable delimiter though -- I don't think I've seen that in any fields of interest.

I don't think this is urgent. I just wanted to make sure it didn't get lost.

jaeddy commented 4 years ago

Grant info has been converted to STRING_LIST for the projects, datasets, tools, and publications tables. I think this should resolve the parsing issue.
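A rough sketch of what that kind of conversion could look like on the data side, assuming the rows are available as plain Python dictionaries. This is not the script that was actually run, and it makes no Synapse calls. The `to_string_list` helper, the second row, and the idea of checking the longest item before sizing a list column are all assumptions for illustration.

```python
def to_string_list(value, delimiter="|"):
    """Turn a delimited cell into a list of trimmed, non-empty strings."""
    if value is None:
        return []
    return [part.strip() for part in value.split(delimiter) if part.strip()]


rows = [
    {
        "datasetAlias": "PRJNA352075",
        "grantName": (
            "Columbia University Center for Topology of Cancer Evolution and Heterogeneity"
            " | Dana-Farber Cancer Institute Physical Sciences-Oncology Center"
        ),
    },
    # Hypothetical row: a single grant whose name contains a comma.
    {
        "datasetAlias": "EXAMPLE-1",
        "grantName": "Example Center for Imaging, Modeling and Analysis",
    },
]

# Replace each delimited string with a real list of grant names.
converted = [{**row, "grantName": to_string_list(row["grantName"])} for row in rows]

# Sizes worth checking before choosing limits for a list column,
# given the concern about long grant names.
longest_item = max(len(item) for row in converted for item in row["grantName"])
longest_list = max(len(row["grantName"]) for row in converted)

print(converted[0]["grantName"])
print(longest_item, longest_list)
```

With list values, names containing commas stay intact and the portal no longer has to guess a delimiter.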