Closed lwaldron closed 2 years ago
So I guess we just want to ellipsize strings in a showAsCell,character() method?
I guess so. (Too much low-level technical jargon here.)
In other words, do we want to truncate and append "..." (an ellipsis) to strings over a certain length within character vectors displayed as a column in a table? Would that suffice to resolve this issue? If so, any suggestions about the length cutoff? Maybe 13, so that 10 are shown with the ellipsis?
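To make the proposal concrete, here is a minimal base R sketch of that truncation rule. The name `ellipsize` and its arguments are hypothetical and only for illustration; this is not the S4Vectors implementation, just one way the "cutoff of 13, show 10 plus the ellipsis" idea could look.

```r
# Hypothetical helper (not part of S4Vectors): shorten long strings for display.
# Strings longer than `cutoff` keep their first `cutoff - nchar(marker)`
# characters and get `marker` appended; shorter strings and NAs are untouched.
ellipsize <- function(x, cutoff = 13L, marker = "...") {
  stopifnot(is.character(x), cutoff > nchar(marker))
  too_long <- !is.na(x) & nchar(x) > cutoff
  keep <- cutoff - nchar(marker)
  x[too_long] <- paste0(substr(x[too_long], 1L, keep), marker)
  x
}

shortened <- ellipsize(c("toto", "I like tacos", paste(letters, collapse = "_")))
print(shortened)
```

With these defaults no displayed string is wider than 13 characters, and names on the input vector are preserved.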
The tibble behavior seems a bit strange. The first "evil" column is truncated to only show 4 characters (pretty useless) but there is another column where up to 8 characters are shown without an ellipsis, and there is a truncated string that shows 7.
Hi @lwaldron @lawremi
A while ago (in BioC 3.12) I made the following change in S4Vectors: ccbde78beeaa21d47d3781bc98628a12445c6e3d
With this change:
library(S4Vectors)
labels <- c(A=paste(letters, collapse="_"), B="toto", C="I like tacos")
DataFrame(labels, rev(labels))
# DataFrame with 3 rows and 2 columns
#                   labels            rev.labels.
#              <character>            <character>
# A a_b_c_d_e_f_g_h_i_j_..           I like tacos
# B                   toto                   toto
# C           I like tacos a_b_c_d_e_f_g_h_i_j_..
Can't remember why I chose 22 characters as the cutoff value at the time but I can make it smaller if you guys think 22 is too big.
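To help compare candidate cutoffs, here is a small base R sketch that mimics what the output above appears to do (long strings cut to 20 characters followed by ".."). The function `truncate_cell` is a hypothetical stand-in, not the actual S4Vectors code, and how S4Vectors treats strings of exactly 21 or 22 characters is an assumption here.

```r
# Illustrative stand-in for the truncation seen in the display above;
# NOT the S4Vectors implementation.
truncate_cell <- function(x, cutoff = 22L, marker = "..") {
  long <- !is.na(x) & nchar(x) > cutoff
  x[long] <- paste0(substr(x[long], 1L, cutoff - nchar(marker)), marker)
  x
}

labels <- c(A = paste(letters, collapse = "_"), B = "toto", C = "I like tacos")

# Widest displayed cell under each candidate cutoff: 22 is the value
# mentioned above, 13 is the alternative floated earlier in this thread.
widths <- sapply(c(`13` = 13L, `22` = 22L),
                 function(k) max(nchar(truncate_cell(labels, cutoff = k))))
print(truncate_cell(labels))
print(widths)
```

A smaller cutoff saves 9 characters per character column in this example, at the cost of showing less of each long string.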
With the evil example above:
library(curatedTCGAData)
mae <- curatedTCGAData("ACC", assays="RPPA*", dry.run=FALSE, version="1.1.38")
DF <- colData(mae)
DF$evil <- apply(DF, 1, function(x) paste(x, collapse="_"))
DF <- DF[, ncol(DF):1]
Then:
> DF[1:7]
DataFrame with 46 rows and 7 columns
                               evil       ADS genome_doublings    ploidy
                        <character> <numeric>        <integer> <numeric>
TCGA-OR-A5J2 TCGA-OR-A5J2_44_1_16..     -0.84                0      1.31
TCGA-OR-A5J3 TCGA-OR-A5J3_23_0_NA..      1.18                0      1.25
TCGA-OR-A5J6 TCGA-OR-A5J6_29_0_NA..      1.11                1      3.32
TCGA-OR-A5J7 TCGA-OR-A5J7_30_1_49..      1.03                1      2.56
TCGA-OR-A5J8 TCGA-OR-A5J8_66_1_57..     -3.37                2      5.62
...                             ...       ...              ...       ...
TCGA-PA-A5YG TCGA-PA-A5YG_51_0_NA..     -0.76                0      1.25
TCGA-PK-A5H8 TCGA-PK-A5H8_42_0_NA..      0.46               NA        NA
TCGA-PK-A5H9 TCGA-PK-A5H9_27_0_NA..     -0.85                0      2.00
TCGA-PK-A5HA TCGA-PK-A5HA_63_0_NA..     -1.49                0      1.69
TCGA-P6-A5OG TCGA-P6-A5OG_45_1_38..        NA               NA        NA
                purity    OncoSign         COC
             <numeric> <character> <character>
TCGA-OR-A5J2      0.89    TP53/NF1        COC3
TCGA-OR-A5J3      0.93         CN2        COC2
TCGA-OR-A5J6      0.68  TERT/ZNRF3        COC1
TCGA-OR-A5J7      0.84    TP53/NF1        COC3
TCGA-OR-A5J8      1.00         CN1        COC2
...                ...         ...         ...
TCGA-PA-A5YG      0.73  TERT/ZNRF3        COC1
TCGA-PK-A5H8        NA          NA          NA
TCGA-PK-A5H9      0.79    TP53/NF1        COC1
TCGA-PK-A5HA      0.83         CN2        COC1
TCGA-P6-A5OG        NA          NA          NA
Slightly better? OK, you still need to manually select a small number of columns to prevent the 823 columns in this DataFrame from flooding your screen, but personally I've learned to live with that.
I also like to transpose the display in this situation (you'll need S4Vectors 0.33.9 for this to work properly):
> t(DF)
TransposedDataFrame with 823 rows and 46 columns
TCGA-OR-A5J2 TCGA-OR-A5J3
evil <character> TCGA-OR-A5J2_44_1_16.. TCGA-OR-A5J3_23_0_NA..
ADS <numeric> -0.84 1.18
genome_doublings <integer> 0 0
ploidy <numeric> 1.31 1.25
purity <numeric> 0.89 0.93
... ... ... ...
days_to_last_followup <integer> NA 2091
days_to_death <integer> 1677 NA
vital_status <integer> 1 0
years_to_birth <integer> 44 23
patientID <character> TCGA-OR-A5J2 TCGA-OR-A5J3
TCGA-OR-A5J6 TCGA-OR-A5J7
evil <character> TCGA-OR-A5J6_29_0_NA.. TCGA-OR-A5J7_30_1_49..
ADS <numeric> 1.11 1.03
genome_doublings <integer> 1 1
ploidy <numeric> 3.32 2.56
purity <numeric> 0.68 0.84
... ... ... ...
days_to_last_followup <integer> 2703 NA
days_to_death <integer> NA 490
vital_status <integer> 0 1
years_to_birth <integer> 29 30
patientID <character> TCGA-OR-A5J6 TCGA-OR-A5J7
TCGA-OR-A5J8 TCGA-OR-A5J9
evil <character> TCGA-OR-A5J8_66_1_57.. TCGA-OR-A5J9_22_0_NA..
ADS <numeric> -3.37 0.01
genome_doublings <integer> 2 1
ploidy <numeric> 5.62 2.52
purity <numeric> 1.00 0.84
... ... ... ...
days_to_last_followup <integer> NA 1352
days_to_death <integer> 579 NA
vital_status <integer> 1 0
years_to_birth <integer> 66 22
patientID <character> TCGA-OR-A5J8 TCGA-OR-A5J9
TCGA-OR-A5JA TCGA-OR-A5JP
evil <character> TCGA-OR-A5JA_53_1_92.. TCGA-OR-A5JP_40_0_NA..
ADS <numeric> -0.06 0.70
genome_doublings <integer> 2 1
ploidy <numeric> 5.65 3.00
purity <numeric> 0.75 0.80
... ... ... ...
days_to_last_followup <integer> NA 464
days_to_death <integer> 922 NA
vital_status <integer> 1 0
years_to_birth <integer> 53 40
patientID <character> TCGA-OR-A5JA TCGA-OR-A5JP
TCGA-OR-A5JR TCGA-OR-A5JS
evil <character> TCGA-OR-A5JR_45_0_NA.. TCGA-OR-A5JS_65_0_NA..
ADS <numeric> -0.08 1.27
genome_doublings <integer> 0 1
ploidy <numeric> 1.41 3.46
purity <numeric> 0.88 0.85
... ... ... ...
days_to_last_followup <integer> 3688 383
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 45 65
patientID <character> TCGA-OR-A5JR TCGA-OR-A5JS
TCGA-OR-A5JT TCGA-OR-A5JV
evil <character> TCGA-OR-A5JT_65_0_NA.. TCGA-OR-A5JV_55_0_NA..
ADS <numeric> -1.05 -0.99
genome_doublings <integer> 1 1
ploidy <numeric> 2.67 2.71
purity <numeric> 0.87 0.46
... ... ... ...
days_to_last_followup <integer> 907 2023
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 65 55
patientID <character> TCGA-OR-A5JT TCGA-OR-A5JV
TCGA-OR-A5JW TCGA-OR-A5JY
evil <character> TCGA-OR-A5JW_47_0_NA.. TCGA-OR-A5JY_68_1_55..
ADS <numeric> 1.01 0.12
genome_doublings <integer> 0 NA
ploidy <numeric> 1.57 NA
purity <numeric> 0.89 NA
... ... ... ...
days_to_last_followup <integer> 2202 NA
days_to_death <integer> NA 552
vital_status <integer> 0 1
years_to_birth <integer> 47 68
patientID <character> TCGA-OR-A5JW TCGA-OR-A5JY
TCGA-OR-A5JZ TCGA-OR-A5K0
evil <character> TCGA-OR-A5JZ_60_0_NA.. TCGA-OR-A5K0_69_0_NA..
ADS <numeric> -0.41 0.84
genome_doublings <integer> 0 0
ploidy <numeric> 1.20 1.85
purity <numeric> 0.95 1.00
... ... ... ...
days_to_last_followup <integer> 822 1029
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 60 69
patientID <character> TCGA-OR-A5JZ TCGA-OR-A5K0
TCGA-OR-A5K1 TCGA-OR-A5K3
evil <character> TCGA-OR-A5K1_48_0_NA.. TCGA-OR-A5K3_53_0_NA..
ADS <numeric> -0.13 -0.17
genome_doublings <integer> 0 NA
ploidy <numeric> 1.26 NA
purity <numeric> 0.89 NA
... ... ... ...
days_to_last_followup <integer> 3289 3465
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 48 53
patientID <character> TCGA-OR-A5K1 TCGA-OR-A5K3
TCGA-OR-A5K4 TCGA-OR-A5K5
evil <character> TCGA-OR-A5K4_64_0_NA.. TCGA-OR-A5K5_59_1_49..
ADS <numeric> -0.61 0.61
genome_doublings <integer> 0 1
ploidy <numeric> 1.32 3.28
purity <numeric> 0.97 0.89
... ... ... ...
days_to_last_followup <integer> 1082 NA
days_to_death <integer> NA 498
vital_status <integer> 0 1
years_to_birth <integer> 64 59
patientID <character> TCGA-OR-A5K4 TCGA-OR-A5K5
TCGA-OR-A5K6 TCGA-OR-A5K8
evil <character> TCGA-OR-A5K6_56_0_NA.. TCGA-OR-A5K8_39_0_NA..
ADS <numeric> 1.28 0.31
genome_doublings <integer> 0 0
ploidy <numeric> 1.91 1.51
purity <numeric> 0.91 0.94
... ... ... ...
days_to_last_followup <integer> 1493 749
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 56 39
patientID <character> TCGA-OR-A5K6 TCGA-OR-A5K8
TCGA-OR-A5KO TCGA-OR-A5KU
evil <character> TCGA-OR-A5KO_39_0_NA.. TCGA-OR-A5KU_37_0_NA..
ADS <numeric> 0.46 -0.10
genome_doublings <integer> 0 1
ploidy <numeric> 1.51 2.72
purity <numeric> 0.96 0.82
... ... ... ...
days_to_last_followup <integer> 1414 4673
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 39 37
patientID <character> TCGA-OR-A5KO TCGA-OR-A5KU
TCGA-OR-A5KW TCGA-OR-A5KX
evil <character> TCGA-OR-A5KW_55_0_NA.. TCGA-OR-A5KX_25_0_NA..
ADS <numeric> 0.46 -0.33
genome_doublings <integer> 0 1
ploidy <numeric> 1.40 4.20
purity <numeric> 0.95 0.94
... ... ... ...
days_to_last_followup <integer> 2076 1364
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 55 25
patientID <character> TCGA-OR-A5KW TCGA-OR-A5KX
TCGA-OR-A5KY TCGA-OR-A5KZ
evil <character> TCGA-OR-A5KY_23_1_39.. TCGA-OR-A5KZ_42_1_12..
ADS <numeric> 0.25 -0.89
genome_doublings <integer> 1 1
ploidy <numeric> 2.39 2.31
purity <numeric> 0.73 0.94
... ... ... ...
days_to_last_followup <integer> NA NA
days_to_death <integer> 391 125
vital_status <integer> 1 1
years_to_birth <integer> 23 42
patientID <character> TCGA-OR-A5KY TCGA-OR-A5KZ
TCGA-OR-A5LD TCGA-OR-A5LG
evil <character> TCGA-OR-A5LD_52_1_11.. TCGA-OR-A5LG_46_0_NA..
ADS <numeric> -0.26 1.28
genome_doublings <integer> 1 0
ploidy <numeric> 3.19 1.81
purity <numeric> 0.96 0.92
... ... ... ...
days_to_last_followup <integer> NA 1589
days_to_death <integer> 1197 NA
vital_status <integer> 1 0
years_to_birth <integer> 52 46
patientID <character> TCGA-OR-A5LD TCGA-OR-A5LG
TCGA-OR-A5LH TCGA-OR-A5LJ
evil <character> TCGA-OR-A5LH_36_1_23.. TCGA-OR-A5LJ_54_1_11..
ADS <numeric> 0.59 -0.68
genome_doublings <integer> 0 0
ploidy <numeric> 1.28 1.38
purity <numeric> 0.85 0.93
... ... ... ...
days_to_last_followup <integer> NA NA
days_to_death <integer> 2385 1105
vital_status <integer> 1 1
years_to_birth <integer> 36 54
patientID <character> TCGA-OR-A5LH TCGA-OR-A5LJ
TCGA-OR-A5LK TCGA-OR-A5LL
evil <character> TCGA-OR-A5LK_62_0_NA.. TCGA-OR-A5LL_75_1_16..
ADS <numeric> -2.13 1.17
genome_doublings <integer> 1 1
ploidy <numeric> 2.74 3.78
purity <numeric> 0.58 0.76
... ... ... ...
days_to_last_followup <integer> 2740 NA
days_to_death <integer> NA 1613
vital_status <integer> 0 1
years_to_birth <integer> 62 75
patientID <character> TCGA-OR-A5LK TCGA-OR-A5LL
TCGA-OR-A5LM TCGA-OR-A5LN
evil <character> TCGA-OR-A5LM_23_0_NA.. TCGA-OR-A5LN_31_0_NA..
ADS <numeric> 0.45 -0.29
genome_doublings <integer> NA 1
ploidy <numeric> NA 3.12
purity <numeric> NA 0.70
... ... ... ...
days_to_last_followup <integer> 1858 2342
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 23 31
patientID <character> TCGA-OR-A5LM TCGA-OR-A5LN
TCGA-OR-A5LO TCGA-OR-A5LP
evil <character> TCGA-OR-A5LO_61_1_24.. TCGA-OR-A5LP_37_0_NA..
ADS <numeric> -0.41 -0.66
genome_doublings <integer> 1 0
ploidy <numeric> 2.72 1.28
purity <numeric> 0.88 0.76
... ... ... ...
days_to_last_followup <integer> NA 1857
days_to_death <integer> 2405 NA
vital_status <integer> 1 0
years_to_birth <integer> 61 37
patientID <character> TCGA-OR-A5LO TCGA-OR-A5LP
TCGA-OR-A5LS TCGA-OR-A5LT
evil <character> TCGA-OR-A5LS_34_0_NA.. TCGA-OR-A5LT_57_0_NA..
ADS <numeric> 0.48 -1.46
genome_doublings <integer> 1 1
ploidy <numeric> 2.99 2.78
purity <numeric> 0.84 0.92
... ... ... ...
days_to_last_followup <integer> 1096 549
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 34 57
patientID <character> TCGA-OR-A5LS TCGA-OR-A5LT
TCGA-OU-A5PI TCGA-PA-A5YG
evil <character> TCGA-OU-A5PI_53_0_NA.. TCGA-PA-A5YG_51_0_NA..
ADS <numeric> 0.25 -0.76
genome_doublings <integer> 0 0
ploidy <numeric> 1.57 1.25
purity <numeric> 0.89 0.73
... ... ... ...
days_to_last_followup <integer> 1171 756
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 53 51
patientID <character> TCGA-OU-A5PI TCGA-PA-A5YG
TCGA-PK-A5H8 TCGA-PK-A5H9
evil <character> TCGA-PK-A5H8_42_0_NA.. TCGA-PK-A5H9_27_0_NA..
ADS <numeric> 0.46 -0.85
genome_doublings <integer> NA 0
ploidy <numeric> NA 2.00
purity <numeric> NA 0.79
... ... ... ...
days_to_last_followup <integer> 3623 616
days_to_death <integer> NA NA
vital_status <integer> 0 0
years_to_birth <integer> 42 27
patientID <character> TCGA-PK-A5H8 TCGA-PK-A5H9
TCGA-PK-A5HA TCGA-P6-A5OG
evil <character> TCGA-PK-A5HA_63_0_NA.. TCGA-P6-A5OG_45_1_38..
ADS <numeric> -1.49 NA
genome_doublings <integer> 0 NA
ploidy <numeric> 1.69 NA
purity <numeric> 0.83 NA
... ... ... ...
days_to_last_followup <integer> 1201 NA
days_to_death <integer> NA 383
vital_status <integer> 0 1
years_to_birth <integer> 63 45
patientID <character> TCGA-PK-A5HA TCGA-P6-A5OG
Looks neater than the untransposed display.
H.
For example:
It might be good to copy the tibble show method:
In general, the show method for DataFrame becomes unseemly when there are long character elements. For an even tougher example, try colData(curatedTCGAData::curatedTCGAData("ACC", assays="RPPA*", dry.run=FALSE)). tibble seems to handle even this evil show example well: