Closed: ferdiu closed this issue 2 years ago
Do you have an example file?
I ran into the problem while trying to load the file JapaneseVowels_TRAIN.arff from this dataset:
https://timeseriesclassification.com/description.php?Dataset=JapaneseVowels
Sorry, I may be misremembering the exact name of the file.
On Sat, 4 Jun 2022, 19:30, Christopher Rowley @.***> wrote:
Do you have an example file?
Ok thanks, that looks doable.
Right, this works on the main branch now (pkg> add ARFFFiles#main). Each element of a relational column is read as a table (an ARFFTable specifically). Let me know if you find any problems with it, and if not I'll make a release.
Thank you for implementing this so quickly.
It seems that load is doing its job handling relational attributes, but the save function complains that it cannot handle Tables.DictColumnTable as the eltype of a column:
julia> ARFFFiles.save("test_save.arff", df)
ERROR: ARFF does not support data of type Tables.DictColumnTable in column cepstrum_coefficient
Stacktrace:
[1] save(io::IOStream, df::DataFrame; relation::String, comment::String)
@ ARFFFiles ~/.julia/packages/ARFFFiles/alRHW/src/ARFFFiles.jl:1138
[2] save
@ ~/.julia/packages/ARFFFiles/alRHW/src/ARFFFiles.jl:1091 [inlined]
[3] #25
@ ~/.julia/packages/ARFFFiles/alRHW/src/ARFFFiles.jl:1147 [inlined]
[4] open(::ARFFFiles.var"#25#26"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, DataFrame}, ::String, ::Vararg{String}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ Base ./io.jl:330
[5] open
@ ./io.jl:328 [inlined]
[6] #save#24
@ ~/.julia/packages/ARFFFiles/alRHW/src/ARFFFiles.jl:1147 [inlined]
[7] save(filename::String, df::DataFrame)
@ ARFFFiles ~/.julia/packages/ARFFFiles/alRHW/src/ARFFFiles.jl:1147
[8] top-level scope
@ REPL[13]:1
This is the result of trying to save the same file mentioned earlier after loading it into memory.
As a plus, I would probably convert the inner table to the same type as the outer one. For instance, calling ARFFFiles.load(DataFrame, "JapaneseVowels_TRAIN.arff") now returns a two-column DataFrame with the first column of type DictColumnTable (I think it should be DataFrame for consistency) and the second of type CategoricalValue{String, UInt32}.
But I guess this could be just my preference rather than the way to go; it is up to you.
Yeah, I haven't implemented saving columns of tables as relational attributes yet. It's tricky. I'd strongly recommend not saving data as ARFF anyway; use a more standard format.
I agree that it would be nice to recursively convert the inner tables too, but it breaks the existing API a bit. I'll think about it.
OK, load(DataFrame, "some/file.arff") now recursively converts relational columns.
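For anyone landing here later, a minimal sketch of checking that behaviour, assuming the JapaneseVowels_TRAIN.arff file mentioned above has been downloaded to the working directory (the local path is an assumption) and that a version of ARFFFiles with this change is installed:

```julia
using ARFFFiles, DataFrames

# Assumed local path; the thread does not say where the file was stored.
path = "JapaneseVowels_TRAIN.arff"

# Load the outer table as a DataFrame, as in the call quoted above.
df = ARFFFiles.load(DataFrame, path)

# Per the earlier comment, the first column holds the relational attribute,
# so each of its elements is itself a table.
inner = first(df[!, 1])

# With recursive conversion this is expected to report a DataFrame
# rather than a Tables.DictColumnTable.
println(typeof(inner))
```

This only inspects the loaded data. Saving such a DataFrame back to ARFF is still the unsupported case described earlier in the thread.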
Nice job. Thank you.
I agree with your suggestion to save data in a more standard format, but I would probably open another issue for saving, since someone may try to do it in the future.
The load function should be able to load ARFF files containing relational attributes as specified here.