UCSC-MedBook / patient-care

Clinician-facing portal showing pathways, signatures, and genes of interest

Surround non-numeric tsv fields with quotes #90

Open rcurrie opened 8 years ago

rcurrie commented 8 years ago

Excel will convert text that looks like a date unless you surround it with quotes, which leads to all sorts of shenanigans:

http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7

By default, surround all non-numeric fields (gene names in particular) with quotes in the TSV files that are downloaded (expression level tables, gene sets, ...). Ideally, also provide a way to turn the quoting off, but make it harder to get to, e.g. a 'Format for Excel' checkbox that is checked by default.
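
A minimal sketch of the per-field rule proposed above, written in Python purely for illustration. The names `is_numeric`, `format_field`, `format_row`, and the `format_for_excel` flag are hypothetical and not part of the patient-care code; the flag stands in for the 'Format for Excel' checkbox.

```python
import math


def is_numeric(value: str) -> bool:
    """Return True if the string parses as a finite number."""
    try:
        return math.isfinite(float(value))
    except ValueError:
        return False


def format_field(value: str, format_for_excel: bool = True) -> str:
    """Quote a non-numeric field; leave numeric fields untouched."""
    if not format_for_excel or is_numeric(value):
        return value
    # Double any embedded quotes, following the usual CSV/TSV convention.
    return '"' + value.replace('"', '""') + '"'


def format_row(fields, format_for_excel: bool = True) -> str:
    """Join one row of string fields into a tab-separated line."""
    return "\t".join(format_field(f, format_for_excel) for f in fields)


print(format_row(["SEPT1", "12.5", "MARCH1"]))
print(format_row(["SEPT1", "12.5", "MARCH1"], format_for_excel=False))
```

With the flag on, only the gene names are quoted; with it off, the row is written unchanged.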

rbaertsch commented 8 years ago

Be careful. I worry that ALL fields are non-numeric. How do we currently represent nulls in numeric fields? Are they strings?

rcurrie commented 8 years ago

Maybe we just focus on the primary culprits: the gene name columns. If a column contains anything numeric, we leave the entire column as is, without quotes.
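
A rough sketch of that column-level rule, in Python for illustration only. It assumes the table is a list of equal-length rows of strings with no header row, and that nulls appear as a string such as `NA`; both are assumptions, since how nulls are represented is still an open question in this thread. The function names are hypothetical.

```python
def is_numeric(value: str) -> bool:
    """Return True if the string parses as a float (Python also accepts 'nan' and 'inf')."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def text_only_columns(rows):
    """Return the indices of columns in which no value parses as a number."""
    if not rows:
        return set()
    width = len(rows[0])
    return {c for c in range(width) if not any(is_numeric(r[c]) for r in rows)}


def quote(value: str) -> str:
    """Wrap a value in double quotes, doubling any embedded quotes."""
    return '"' + value.replace('"', '""') + '"'


def format_table(rows) -> str:
    """Quote every value in text-only columns; leave all other columns as is."""
    quoted = text_only_columns(rows)
    lines = []
    for row in rows:
        cells = [quote(v) if i in quoted else v for i, v in enumerate(row)]
        lines.append("\t".join(cells))
    return "\n".join(lines)


rows = [["SEPT1", "1.5", "NA"], ["MARCH1", "NA", "NA"]]
print(format_table(rows))
```

In this example the second column holds one number, so its `NA` is left unquoted, while a column made up entirely of `NA` strings is treated as text and quoted.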
