xKDR / TSFrames.jl

Timeseries in Julia
MIT License
90 stars 22 forks source link

Various comments #128

Open bkamins opened 1 year ago

bkamins commented 1 year ago

The index is stored in the Index column of coredata.

x-link to Index is missing

If coredata only has a single column then a new sequential index is generated.

  1. What if it has no columns?
  2. What if I want this single column already has name Index?

In general mention in the description the second argument in the constructor.

Mention that any Tables.jl table is also accepted.

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/apply.jl#L149

What if index column is not called :Index? (the same with other places - the docs above said that the index column does not have to be called Index, but maybe I misunderstood)

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/diff.jl#L79

I would throw ArgumentError for easier error handling (also in other places)

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/endpoints.jl#L321

Why not just Period (this would allow for supporting user extensions of special date/time types)

j::AbstractVector{T}) where {T<:Union{String, Symbol}}

This is minor, but it is incorrect as it allows:

julia> Union{String, Symbol}[:s, "s"]
2-element Vector{Union{String, Symbol}}:
 :s
 "s"

Indexing with string

  1. You allow only String, while it would be better to allow for AbstractString
  2. You do not allow for indexing with vector of Bool (neither rows nor columns) - maybe this is intended
  3. You do not allow special selectors like Between, Not etc. (maybe this is intended?)

return (f == :coredata) ? getfield(ts, :coredata) : getproperty(ts.coredata, f)

This is incorrect. coredata can have :coredata column name. fields and properties should not be mixed. I would use a function to expose :coredata field instead. Eg. getdataframe(tsf::TSFrame) = getfield(tsf, :coredata)

Base.join

People did not like extending this function in packages. Better add methods for innerjoin etc. You might consider allowing passing kwargs like renamecols.

DataFrames.Matrix(ts.coredata[!, Not(:Index)])

DataFrames prefix is not needed

general question (maybe I missed something)

Maybe you want TSFrame to be iterable? I.e. have length and support something like:

for row in tsf

here have DataFrameRow with data

end

general question (maybe I missed something)

How do you want to support panel data (so essentially TSFrame grouped by some column)

chiraganand commented 1 year ago

Thanks @bkamins for the very useful comments. I have responded below.

The index is stored in the Index column of coredata.

x-link to Index is missing

Sorry, did not understand this. What is the purpose of cross linking?

If coredata only has a single column then a new sequential index is generated.

1. What if it has no columns?

There is a constructor for creating an empty TSFrame object but currently it does not work with a DataFrame without any columns. Ideally, TSFrame(DataFrame()) should create a TSFrame with an Index column and 0 other non-index columns and 0 rows.

2. What if I want this single column already has name `Index`?

Right now, TSFrame(DataFrame(Index=[1, 2])) throws an error because a DataFrame cannot be constructed with two Index column names. This only happens when the DataFrame has only one column named Index. In other cases, the column named Index is taken as the index column of TSFrame, this is the intended behaviour. The single column DataFrame implementation will have to be fixed.

In general mention in the description the second argument in the constructor. Mention that any Tables.jl table is also accepted.

Yes, to both.

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/apply.jl#L149

What if index column is not called :Index? (the same with other places - the docs above said that the index column does not have to be called Index, but maybe I misunderstood)

It means that the original DataFrame does not need to have a column named Index because the TSFrame index can be constructed using an existing column by specifying the column index (Int) or the name (Symbol/String).

If the column number or name isn't specified in the constructor then Index named column is looked up and if that doesn't exist then Index is generated.

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/diff.jl#L79

I would throw ArgumentError for easier error handling (also in other places)

Yes.

https://github.com/xKDR/TSFrames.jl/blob/c2078d07a7b73d8d342ba9a14aa850dea25bca9f/src/endpoints.jl#L321

Why not just Period (this would allow for supporting user extensions of special date/time types)

Yes, of course. Got missed.

j::AbstractVector{T}) where {T<:Union{String, Symbol}}

This is minor, but it is incorrect as it allows:

julia> Union{String, Symbol}[:s, "s"]
2-element Vector{Union{String, Symbol}}:
 :s
 "s"

Thanks, did not know about this.

Indexing with string

1. You allow only `String`, while it would be better to allow for `AbstractString`

Yes.

2. You do not allow for indexing with vector of `Bool` (neither rows nor columns) - maybe this is intended
3. You do not allow special selectors like `Between`, `Not` etc. (maybe this is intended?)

Not that both of these are not intended but were lower priority earlier. I believe these would be good enhancements.

return (f == :coredata) ? getfield(ts, :coredata) : getproperty(ts.coredata, f)

This is incorrect. coredata can have :coredata column name. fields and properties should not be mixed. I would use a function to expose :coredata field instead. Eg. getdataframe(tsf::TSFrame) = getfield(tsf, :coredata)

Alright, will fix this.

Base.join

People did not like extending this function in packages. Better add methods for innerjoin etc. You might consider allowing passing kwargs like renamecols.

This is done to keep the API simple with a single join method supporting all kinds of joins. R zoo/xts follow this convention.

DataFrames.Matrix(ts.coredata[!, Not(:Index)])

DataFrames prefix is not needed

Okay.

general question (maybe I missed something)

Maybe you want TSFrame to be iterable? I.e. have length and support something like:

for row in tsf # here have DataFrameRow with data end

This is possible with Tables.jl implementation: for row in Tables.rows(tsf)

general question (maybe I missed something)

How do you want to support panel data (so essentially TSFrame grouped by some column)

There hasn't been much thought about taking this path actually. Not sure if TSFrames should handle panel data use cases or another package which uses composition with TSFrame?

bkamins commented 1 year ago

What is the purpose of cross linking?

So that in the generated documentation you can click on Index and go to its definition.

chiraganand commented 2 months ago

The list of issues to track some of your comments:

  1. https://github.com/xKDR/TSFrames.jl/issues/182
  2. https://github.com/xKDR/TSFrames.jl/issues/183
  3. https://github.com/xKDR/TSFrames.jl/issues/184
  4. https://github.com/xKDR/TSFrames.jl/issues/185
  5. https://github.com/xKDR/TSFrames.jl/issues/186
  6. https://github.com/xKDR/TSFrames.jl/issues/187
  7. https://github.com/xKDR/TSFrames.jl/issues/188
  8. https://github.com/xKDR/TSFrames.jl/issues/189
  9. https://github.com/xKDR/TSFrames.jl/issues/190
  10. https://github.com/xKDR/TSFrames.jl/issues/191
  11. https://github.com/xKDR/TSFrames.jl/issues/192
  12. https://github.com/xKDR/TSFrames.jl/issues/194