Open hpages opened 2 years ago
I'd say yes to all 5, but I wonder what "enforce names" in proposal 1 entails. Would validObject fail if names are absent, duplicated, or NA? The constructor would supply "X", "X.1", and so on when no names are present, per proposals 2 and 5.
I was thinking of mimicking the approach taken by data.frame()/DataFrame(), which is to make it hard, but not impossible, to construct an object with no names, or with names that contain "", NA, or duplicates:
df <- data.frame(a=11:14, b=LETTERS[1:4], c=31:34)
names(df) <- NULL
names(df)
# NULL
names(df) <- c("", NA, "")
names(df)
# [1] "" NA ""
As you can see, you can completely get rid of the names, or set names with "", NA, or duplicates, if you really want to. But I was not necessarily thinking of encoding this in the validity method for SummarizedExperiment objects, at least not for now, because I don't know how many serialized SummarizedExperiment derivatives this would break. This is something that can always be done later.
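For contrast, here is a small sketch of the "hard" half of that behavior: what the data.frame() constructor does with missing or duplicated names by default. It uses base R only; DataFrame() is not exercised here, and the column values are made up for illustration.

```r
# Base R only; DataFrame() from S4Vectors is not exercised here.

# No names supplied: the constructor generates syntactic, unique names.
df1 <- data.frame(11:14, 31:34)
names(df1)

# Duplicated names are made unique by default (check.names=TRUE).
df2 <- data.frame(a=1:2, a=3:4)
names(df2)
# [1] "a"   "a.1"

# Opting out is possible, so duplicates can survive if you insist.
df3 <- data.frame(a=1:2, a=3:4, check.names=FALSE)
names(df3)
# [1] "a" "a"
```

So the constructor cleans names up unless told otherwise, while the names<- setter shown above stays permissive.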
This started as a more general discussion about empty strings in List names but the real concern seems to be more specifically about the names of the assays. It comes down to these basic questions:
Should we enforce names on the assays? Right now assay names are optional:
If the user does not supply assay names, should we make automatic names? (the other option would be to complain in an error message)
Should we enforce their uniqueness? Right now they can have duplicates:
Should we also forbid empty or NA names? Right now they are allowed:
My answer would be "yes" to all 4 questions.
Note that the situation is very similar to what data.frame() and DataFrame() do with column names (when check.names=TRUE). So the last question is:
Should we use make.names(., unique=TRUE) like data.frame() and DataFrame() do to fix the user-supplied names?
@LTLA @vjcitn @lawremi Comments? Suggestions?
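To make the make.names(., unique=TRUE) option concrete, here is a minimal sketch of what it would do to assay names. The helper fix_assay_names() and its arguments are hypothetical, written only for this illustration; nothing like it is claimed to exist in SummarizedExperiment.

```r
# What make.names() does with empty, NA, and duplicated names.
make.names(c("", NA, ""), unique=TRUE)
# [1] "X"   "NA." "X.1"

# Hypothetical helper (not part of SummarizedExperiment): generate names
# when none are supplied, otherwise repair the user-supplied ones.
fix_assay_names <- function(nms, n_assays) {
  if (is.null(nms))
    nms <- character(n_assays)        # n_assays empty strings
  stopifnot(length(nms) == n_assays)
  make.names(nms, unique=TRUE)
}

fix_assay_names(NULL, 3)
# [1] "X"   "X.1" "X.2"

fix_assay_names(c("counts", "counts", "log counts"), 3)
# [1] "counts"     "counts.1"   "log.counts"
```

One thing to weigh: make.names() also rewrites names that are merely non-syntactic (the space in "log counts" becomes a dot), which goes beyond enforcing presence and uniqueness.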