Open apjanke opened 6 months ago
The implementation of string
class you merged in main branch is missing the <missing>
values as it is
.
As a result
>> sz = [4 3];
>> varTypes = {'double','datetime','string'};
>> T = table('Size',sz,'VariableTypes',varTypes)
T =
table: 4 rows x 3 variables
VariableNames: Var1, Var2, Var3
>> prettyprint(T)
----------------------
| Var1 | Var2 | Var3 |
----------------------
| 0 | NaT | |
| 0 | NaT | |
| 0 | NaT | |
| 0 | NaT | |
----------------------
instead of
>> prettyprint(T)
---------------------------
| Var1 | Var2 | Var3 |
---------------------------
| 0 | NaT | <missing> |
| 0 | NaT | <missing> |
| 0 | NaT | <missing> |
| 0 | NaT | <missing> |
---------------------------
I really think you should reconsider adopting my approach on this issue.
I don't think my string is missing missing values. It's just that string({''})
is not how you create them. You need to use string(missing)
or NaS
instead. (string(missing)
is the Matlab way of doing it; NaS
is my own extension.)
There was a bug in string.missing
that caused it and NaS
to return ""
instead of missings; find a fix here: https://github.com/apjanke/octave-tablicious/commit/92602102a6913b13b4dc091e4e02bb2df9ab16d9.
I agree there's a case for making a "preallocated" table like you're referring to, including one populated with missings. I just don't think a change to the string
constructor semantics is required for that.
I took a stab at my own implementation of the table "preallocation" constructor over here on branch WIP/preallocation-ctor-form. Have a look and see what you think.
One thing is that Matlab's table()
constructor, for the Size/VariableTypes preallocation form, defines the fill values for variables differently from array expansion, and they're not the missings you're asking for here. Here's the doco in the table()
help page. In particular, the fill value used for string
is ""
, not string(missing)
. And the default for double is 0, not NaN.
For Matlab compatibility, I made my table('Size', 'VarTypes')
constructor match Matlab's fill behavior.
I think there's a case for making preallocated tables that use NaN and string(missing)
as fill values instead of 0 and "". I think that way actually makes more sense, and is less error-prone. To support that, Tablicious could provide a static method or extension to the constructor arguments to support that. My implementation on that branch has the code to do so, but it's not part of the public interface yet; it's in tblish.table.internal.blankTable
. (Packages named +internal
are by convention (at least in Matlab) internal-use stuff that's not part of the public API of a piece of code, even though they have public access at the language level.)
The main string class doco needs some work. The main intro sentence, "A string array of Unicode strings.", is really clumsy, the whole thing focuses too much on Unicode, and it's not clear who the audience is. (Like, I'm basically talking to coders who already understand basic computer text representation and the standard Octave text types.)
Update it.
This is expanded from https://github.com/apjanke/octave-tablicious/issues/110.