stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
176 stars 45 forks source link

trend filtering version 2 #71

Closed KaiqianZhang closed 5 years ago

KaiqianZhang commented 5 years ago

I revised trend filtering to make it work without forming any matrix. Thanks!!

codecov-io commented 5 years ago

Codecov Report

Merging #71 into master will increase coverage by 1.02%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #71      +/-   ##
=========================================
+ Coverage   62.87%   63.9%   +1.02%     
=========================================
  Files          20      20              
  Lines         660     687      +27     
=========================================
+ Hits          415     439      +24     
- Misses        245     248       +3
Impacted Files Coverage Δ
R/update_each_effect.R 100% <ø> (ø) :arrow_up:
R/elbo.R 100% <100%> (ø) :arrow_up:
R/trend_filtering_multiplication.R 82.14% <100%> (+36.68%) :arrow_up:
R/sparse_multiplication.R 100% <100%> (+13.04%) :arrow_up:
R/susie_trendfilter.R 100% <100%> (ø) :arrow_up:
R/susie.R 78.57% <100%> (ø) :arrow_up:
R/susie_utils.R 42.54% <0%> (-3.87%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update dc42e47...2780824. Read the comment docs.

stephens999 commented 5 years ago

Can you add some unit tests for the compute_tf functions? I think that may also help clarify exactly what they are doing!

KaiqianZhang commented 5 years ago

(1) My trend filter matrix is the inverse of D^{(k+1)}, where k is the order of trend filtering, D^{(1)} is the {-1,1} banded matrix, and D^{(k+1)} = D^{(k)} * D^{(1)}. Should I explicitly describe it in the documentation?

(2) I will fix the typo!

(3) This is actually by my experiments and observation. I notice that given X is a trend filter matrix I described above, colSums(X.scaledX.scaled) is always rep(n-1, n) no matter what the order is. The exception is when order=0, it is slightly different (last entry is 0). For example, say n = 10 (n is the number of column/row of X). When order is 0, colSums(X.scaledX.scaled) = [9,9,9,9,9,9,9,9,9,0]. And when order is 1,2,3,..., the result is always [9,9,9,9,9,9,9,9,9,9].

On Wed, Oct 31, 2018 at 6:33 PM stephens999 notifications@github.com wrote:

@stephens999 commented on this pull request.

I'm not quite confident I fully understand. Is there somewhere where you describe what you consider the trend filter matrix is?

In R/susie_trendfilter.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900117:

' @param order a scalar for the order of trend filtering

' @return SuSiE fit for trend filtering

' @examples

-#' y = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) + rnorm(50) -#' susie_trendfilter(y, 0) -#' susie_trendfilter(y, 1, L=20) -#' susie_trenndfilter(y, 0, estimate_prior_variance = TRUE) +#' set.seed(1) +#' mu = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) +#' y = mu + rnorm(50) +#' s = susie_trendfilter(y, 0) +#' plot(y,pch=".") +#' lines(mu,col=1,lwd=3) +#' lines(predict(s),col=2,lwd=2) +#' s0 = susie_trenndfilter(y, 0, estimate_prior_variance = TRUE)

Typo!

In R/trend_filtering_multiplication.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900511:

+#' @keywords internal +compute_tf_csd = function(order, n){

  • cm = compute_tf_cm(order, n)
  • csd = sqrt((compute_tf_d(order,n)/n - cm^2)*n/(n-1))
  • csd[which(csd==0)]=1
  • return(csd) +}
  • +#' @title Compute colSums(X * X), where X is a standardized trend filtering matrix +#' @param order is the order of trend filtering +#' @param n the length of y +#' @return an n vector +#' @keywords internal +compute_tf_std_d = function(order, n){

  • res = rep(n-1, n)
  • if (order==0) res[n] = 0

Why is this a special case?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/pull/71#pullrequestreview-170510817, or mute the thread https://github.com/notifications/unsubscribe-auth/AYMf7_VkUbNuLw2Vv5npfPj9blYmrS09ks5uqjNRgaJpZM4YDJn- .

KaiqianZhang commented 5 years ago

Sure. I will write unit tests and better document those compute_tf functions.

On Wed, Oct 31, 2018 at 6:35 PM stephens999 notifications@github.com wrote:

Can you add some unit tests for the compute_tf functions? I think that may also help clarify exactly what they are doing!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/pull/71#issuecomment-434883251, or mute the thread https://github.com/notifications/unsubscribe-auth/AYMf780lP9kjGtOeqDMBXRurWwGRwGSsks5uqjPZgaJpZM4YDJn- .

stephens999 commented 5 years ago

That is very helpful. Yes please describe in a document. What is x.scaled? I recall that D is not invertible, as conventionally defined, so that also requires some more explanation.

On Wed, Oct 31, 2018, 20:32 KaiqianZhang notifications@github.com wrote:

(1) My trend filter matrix is the inverse of D^{(k+1)}, where k is the order of trend filtering, D^{(1)} is the {-1,1} banded matrix, and D^{(k+1)} = D^{(k)} * D^{(1)}. Should I explicitly describe it in the documentation?

(2) I will fix the typo!

(3) This is actually by my experiments and observation. I notice that given X is a trend filter matrix I described above, colSums(X.scaledX.scaled) is always rep(n-1, n) no matter what the order is. The exception is when order=0, it is slightly different (last entry is 0). For example, say n = 10 (n is the number of column/row of X). When order is 0, colSums(X.scaledX.scaled) = [9,9,9,9,9,9,9,9,9,0]. And when order is 1,2,3,..., the result is always [9,9,9,9,9,9,9,9,9,9].

On Wed, Oct 31, 2018 at 6:33 PM stephens999 notifications@github.com wrote:

@stephens999 commented on this pull request.

I'm not quite confident I fully understand. Is there somewhere where you describe what you consider the trend filter matrix is?

In R/susie_trendfilter.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900117:

' @param order a scalar for the order of trend filtering

' @return SuSiE fit for trend filtering

' @examples

-#' y = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) + rnorm(50) -#' susie_trendfilter(y, 0) -#' susie_trendfilter(y, 1, L=20) -#' susie_trenndfilter(y, 0, estimate_prior_variance = TRUE) +#' set.seed(1) +#' mu = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) +#' y = mu + rnorm(50) +#' s = susie_trendfilter(y, 0) +#' plot(y,pch=".") +#' lines(mu,col=1,lwd=3) +#' lines(predict(s),col=2,lwd=2) +#' s0 = susie_trenndfilter(y, 0, estimate_prior_variance = TRUE)

Typo!

In R/trend_filtering_multiplication.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900511:

+#' @keywords internal +compute_tf_csd = function(order, n){

  • cm = compute_tf_cm(order, n)
  • csd = sqrt((compute_tf_d(order,n)/n - cm^2)*n/(n-1))
  • csd[which(csd==0)]=1
  • return(csd) +}
  • +#' @title Compute colSums(X * X), where X is a standardized trend filtering matrix +#' @param order is the order of trend filtering +#' @param n the length of y +#' @return an n vector +#' @keywords internal +compute_tf_std_d = function(order, n){

  • res = rep(n-1, n)
  • if (order==0) res[n] = 0

Why is this a special case?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub < https://github.com/stephenslab/susieR/pull/71#pullrequestreview-170510817 , or mute the thread < https://github.com/notifications/unsubscribe-auth/AYMf7_VkUbNuLw2Vv5npfPj9blYmrS09ks5uqjNRgaJpZM4YDJn-

.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/pull/71#issuecomment-434901516, or mute the thread https://github.com/notifications/unsubscribe-auth/ABt4xbMtDArFIHjLt5-LPp9NMBFUgE5Oks5uqk8TgaJpZM4YDJn- .

KaiqianZhang commented 5 years ago

Yes. I will write up a report with math proofs on trend filtering, including how to solve the problem that D is not invertible.

What is X.scaled? Sorry, I was not clear about X.scaled. It is X matrix after centering and scaling, i.e. X.scaled = safe_colScale(X).

On Thu, Nov 1, 2018, 09:24 stephens999 <notifications@github.com wrote:

That is very helpful. Yes please describe in a document. What is x.scaled? I recall that D is not invertible, as conventionally defined, so that also requires some more explanation.

On Wed, Oct 31, 2018, 20:32 KaiqianZhang notifications@github.com wrote:

(1) My trend filter matrix is the inverse of D^{(k+1)}, where k is the order of trend filtering, D^{(1)} is the {-1,1} banded matrix, and D^{(k+1)} = D^{(k)} * D^{(1)}. Should I explicitly describe it in the documentation?

(2) I will fix the typo!

(3) This is actually by my experiments and observation. I notice that given X is a trend filter matrix I described above, colSums(X.scaledX.scaled) is always rep(n-1, n) no matter what the order is. The exception is when order=0, it is slightly different (last entry is 0). For example, say n = 10 (n is the number of column/row of X). When order is 0, colSums(X.scaledX.scaled) = [9,9,9,9,9,9,9,9,9,0]. And when order is 1,2,3,..., the result is always [9,9,9,9,9,9,9,9,9,9].

On Wed, Oct 31, 2018 at 6:33 PM stephens999 notifications@github.com wrote:

@stephens999 commented on this pull request.

I'm not quite confident I fully understand. Is there somewhere where you describe what you consider the trend filter matrix is?

In R/susie_trendfilter.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900117:

' @param order a scalar for the order of trend filtering

' @return SuSiE fit for trend filtering

' @examples

-#' y = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) + rnorm(50) -#' susie_trendfilter(y, 0) -#' susie_trendfilter(y, 1, L=20) -#' susie_trenndfilter(y, 0, estimate_prior_variance = TRUE) +#' set.seed(1) +#' mu = c(rep(0,5),rep(1,5),rep(3,5),rep(-2,5),rep(0,30)) +#' y = mu + rnorm(50) +#' s = susie_trendfilter(y, 0) +#' plot(y,pch=".") +#' lines(mu,col=1,lwd=3) +#' lines(predict(s),col=2,lwd=2) +#' s0 = susie_trenndfilter(y, 0, estimate_prior_variance = TRUE)

Typo!

In R/trend_filtering_multiplication.R https://github.com/stephenslab/susieR/pull/71#discussion_r229900511:

+#' @keywords internal +compute_tf_csd = function(order, n){

  • cm = compute_tf_cm(order, n)
  • csd = sqrt((compute_tf_d(order,n)/n - cm^2)*n/(n-1))
  • csd[which(csd==0)]=1
  • return(csd) +}
  • +#' @title Compute colSums(X * X), where X is a standardized trend filtering matrix +#' @param order is the order of trend filtering +#' @param n the length of y +#' @return an n vector +#' @keywords internal +compute_tf_std_d = function(order, n){

  • res = rep(n-1, n)
  • if (order==0) res[n] = 0

Why is this a special case?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <

https://github.com/stephenslab/susieR/pull/71#pullrequestreview-170510817

, or mute the thread <

https://github.com/notifications/unsubscribe-auth/AYMf7_VkUbNuLw2Vv5npfPj9blYmrS09ks5uqjNRgaJpZM4YDJn-

.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/pull/71#issuecomment-434901516, or mute the thread < https://github.com/notifications/unsubscribe-auth/ABt4xbMtDArFIHjLt5-LPp9NMBFUgE5Oks5uqk8TgaJpZM4YDJn-

.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/pull/71#issuecomment-435047765, or mute the thread https://github.com/notifications/unsubscribe-auth/AYMf74f2yMe4Q58ZFpLc_ewOY14y8TXxks5uqwQygaJpZM4YDJn- .

gaow commented 5 years ago

Closing this obsolete pull request. A new one is to be made soon.