Closed ancapdev closed 6 years ago
Merging #361 into master will decrease coverage by
0.02%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #361 +/- ##
==========================================
- Coverage 86.22% 86.19% -0.03%
==========================================
Files 10 10
Lines 479 478 -1
==========================================
- Hits 413 412 -1
Misses 66 66
Impacted Files | Coverage Δ | |
---|---|---|
src/timearray.jl | 98.48% <100%> (-0.02%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update d462cd5...0e2383b. Read the comment docs.
@iblis17 could you review this PR ? Thanks
Could you add some test cases?
Seem it can be faster, could you try this patch out?
--- a/src/timearray.jl
+++ b/src/timearray.jl
@@ -23,19 +23,23 @@ struct TimeArray{T, N, D <: TimeType, A <: AbstractArray{T, N}} <: AbstractTimeS
timestamp::AbstractVector{D},
values::A,
colnames::Vector{String},
- meta::Any;
- unchecked = false) where {T, N, D <: TimeType, A <: AbstractArray{T, N}}
+ meta::Any,
+ unchecked::Bool) where {T, N, D <: TimeType, A <: AbstractArray{T, N}}
nrow, ncol = size(values, 1, 2)
@@ -46,12 +50,12 @@ end
TimeArray(d::AbstractVector{D}, v::AbstractArray{T, N},
c::Vector{S}=fill("", size(v,2)),
m::Any=nothing;
- args...) where {T, N, D <: TimeType, S <: AbstractString} =
- TimeArray{T, N, D, typeof(v)}(d, v, map(String, c), m; args...)
+ unchecked=false) where {T, N, D <: TimeType, S <: AbstractString} =
+ TimeArray{T, N, D, typeof(v)}(d, v, map(String, c), m, unchecked)
TimeArray(d::D, v::AbstractArray{T, N}, c::Vector{S}=fill("", size(v, 2)),
m::Any=nothing;
- args...) where {T, N, D <: TimeType, S <: AbstractString} =
- TimeArray{T, N, D, typeof(v)}([d], v, map(String, c), m; args...)
+ unchecked=false) where {T, N, D <: TimeType, S <: AbstractString} =
+ TimeArray{T, N, D, typeof(v)}([d], v, map(String, c), m, unchecked)
Tests, sure, you mean for the new unchecked path? Other than that there's no new functionality.
Regards the defaulted regular argument vs keyword argument, I can benchmark, but I expect it to make very marginal performance difference given the context of everything else going on (e.g., replace_dupes()
), and I to my mind the API is much worse. Unnamed boolean arguments don't tell you much at the call site, and if you were to later extend the API it could get messy. Happy to do whatever is desired, but this doesn't seem the best trade off.
Unnamed boolean arguments don't tell you much at the call site, and if you were to later extend the API it could get messy. Happy to do whatever is desired, but this doesn't seem the best trade off.
I expect that user usually call the outer constructor instead of inner constructor, so the positional argument of inner constructor is fine for me. User still has TimeArray(.., unchecked = true)
.
But, yeah, as you said, it's only very marginal performance difference. If you do not want to change it, I'm ok.
Cool, if we can get this change in, I'll be rebasing and preparing a PR for optimizing the merge operation next, a more interesting change ;)
Cool, if we can get this change in
sorry, you mean this PR or my patch?
This PR
I've been trying to use
TimeSeries.jl
on some high frequency datasets and ran into some performance bottlenecks inTimeArray
construction. In particular the uniqueness test is heavy.I've made a couple of changes here that help my use cases:
Rough performance improvements for something on the scale of my use case, 10 million entries, 1 column:
If these changes look reasonable I have a few more I'd like to add. To extend this I'd change some of the algorithms to construct
TimeArray
objects unchecked. I also noticed a few places that don't handleFloat32
properly, and I'd like to rectify that.