As of 8f3ed5fa0fc05dad381adda79e2cf502fe9e43bc, the diff expr method drops cube values where `(estimators_df["sem"] <= 0) | (estimators_df["sem"] >= estimators_df["mean"])`. This pruning prevents mathematical errors when transforming the estimators into log space: it avoids taking `log(mean)` when `mean <= 0` and `log(mean - sem)` when `mean - sem <= 0`.
This drops ~1-3% of cube data, depending upon the query filter specified by the user.
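The pruning condition can be sketched as follows. This is a minimal illustration, not the actual diff expr implementation: the helper name `prune_for_log_transform` and the toy data are hypothetical, and only the `sem` and `mean` column names and the boolean condition come from the description above.

```python
import numpy as np
import pandas as pd


def prune_for_log_transform(estimators_df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows where log(mean) or log(mean - sem) would be undefined.

    Hypothetical helper; the mask is the condition quoted in this issue.
    """
    invalid = (estimators_df["sem"] <= 0) | (
        estimators_df["sem"] >= estimators_df["mean"]
    )
    return estimators_df[~invalid]


# Toy cube rows (made-up values, for illustration only).
estimators_df = pd.DataFrame(
    {
        "mean": [2.0, 1.5, 0.5, 3.0],
        "sem": [0.5, 0.0, 0.7, 3.0],
        "n_obs": [40, 1, 2, 3],
    }
)

kept = prune_for_log_transform(estimators_df)

# Every surviving row has 0 < sem < mean, so both logs are finite.
log_mean = np.log(kept["mean"])
log_lower = np.log(kept["mean"] - kept["sem"])
```

Because every kept row satisfies `0 < sem < mean`, both `mean` and `mean - sem` are strictly positive, which is what makes the log transform safe. Note that rows with a NaN `sem` or `mean` compare as `False` and would pass through this mask; whether that matters depends on how the cube is built.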
The zero-valued `sem` cube elements constitute 1.7% of the cube and are computed from `n_obs` counts ranging from 1 to 14:
count 1.875726e+07
mean 1.107691e+00
std 3.761035e-01
min 1.000000e+00
25% 1.000000e+00
50% 1.000000e+00
75% 1.000000e+00
max 1.400000e+01
The `sem > mean` cube elements constitute 0.1% of the cube and are computed from `n_obs` counts ranging primarily from 2 to 5 (the interquartile range), with a maximum of 2726:
count 1.061575e+06
mean 6.328915e+00
std 1.474041e+01
min 2.000000e+00
25% 2.000000e+00
50% 3.000000e+00
75% 5.000000e+00
max 2.726000e+03
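Summaries like the two above could be reproduced with something along these lines. This is a sketch under assumptions: the helper name `summarize_pruned` and the toy data are hypothetical, and the second mask uses the filter's `sem >= mean` condition restricted to rows with positive `sem`, which may differ slightly from how the figures above were computed.

```python
import pandas as pd


def summarize_pruned(estimators_df: pd.DataFrame) -> dict:
    """Report the cube fraction and n_obs distribution for each pruned group."""
    sem, mean = estimators_df["sem"], estimators_df["mean"]
    masks = {
        "zero_sem": sem <= 0,
        # Exclude zero-sem rows so the two groups do not overlap.
        "sem_ge_mean": (sem > 0) & (sem >= mean),
    }
    return {
        name: {
            "fraction": float(mask.mean()),  # share of all cube rows
            "n_obs": estimators_df.loc[mask, "n_obs"].describe(),
        }
        for name, mask in masks.items()
    }


# Toy cube rows (made-up values, for illustration only).
toy_cube = pd.DataFrame(
    {
        "mean": [2.0, 1.5, 0.5, 3.0],
        "sem": [0.5, 0.0, 0.7, 3.0],
        "n_obs": [40, 1, 2, 3],
    }
)

summary = summarize_pruned(toy_cube)
for name, stats in summary.items():
    print(name, f"{stats['fraction']:.1%}")
    print(stats["n_obs"])
```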
Since these estimator values are computed from low counts of raw expression values, it has been deemed acceptable to drop these values entirely:
[ ] Update the builder to filter out these values. (Note: they will be replaced with minimal values in the diff expr method.)