Gavin-Lijy closed this 1 year ago
Yes, I'm also in dire need of this. Should the -s and -a inputs be NumPy arrays of shape N x L x 4, just as in the old TF-MoDISco? This is what I tried:
import numpy as np
task_to_hyp_scores = {}
task_to_scores = {}
onehot_data = {}
task_to_hyp_scoreslist = [el[0] for el in cnnResults[12]['dataset_TP']["npshap"].tolist()]
onehot_data[12] = cnnResults[12]['dataset_TP']["ohs"].tolist()
print("ohs")
display(np.shape(onehot_data[12]))
display(onehot_data[12][0])
task_to_hyp_scores[12] = [np.squeeze(scores) for scores in task_to_hyp_scoreslist]
print("shaps")
display(task_to_hyp_scores[12][0])
display(np.shape(task_to_hyp_scores[12]))
ohePath = f"{wb.resultsFolder}interpretation/nponehotdata12.npy"
np.save(ohePath, np.array(onehot_data[12]))
shapPath = f"{wb.resultsFolder}interpretation/npshapdata12.npy"
np.save(shapPath, np.array(task_to_hyp_scores[12]))
modiscoResultsPath = f"{wb.resultsFolder}interpretation/modisco_results.h5"
!modisco motifs -s {ohePath} -a {shapPath} -n 2000 -o {modiscoResultsPath}
which gives the following output and error:
ohs
(161, 300, 4)
array([[1, 0, 0, 0],
[0, 0, 1, 0],
[0, 1, 0, 0],
...,
[0, 0, 0, 1],
[0, 0, 1, 0],
[0, 0, 0, 1]])
shaps
array([[ 1.53195577e-05, 3.58301720e-05, 1.82393234e-05,
5.96044316e-06],
[ 1.10478895e-05, 7.33038779e-05, 3.05729173e-05,
2.67649664e-05],
[-1.47533502e-05, -1.00547937e-04, 1.00632858e-04,
-5.94360919e-05],
...,
[-5.54023399e-05, 1.02747357e-04, -2.64519495e-07,
-7.14052694e-05],
[ 6.13165478e-06, -7.53152633e-06, 6.90066693e-05,
7.19849909e-07],
[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
0.00000000e+00]])
(161, 300, 4)
Traceback (most recent call last):
File "/usr/local/bin/modisco", line 108, in <module>
pos_patterns, neg_patterns = modiscolite.tfmodisco.TFMoDISco(
File "/usr/local/lib/python3.8/dist-packages/modiscolite/tfmodisco.py", line 281, in TFMoDISco
seqlet_coords, threshold = extract_seqlets.extract_seqlets(
File "/usr/local/lib/python3.8/dist-packages/modiscolite/extract_seqlets.py", line 158, in extract_seqlets
pos_null_values, neg_null_values = _laplacian_null(track=smoothed_tracks,
File "/usr/local/lib/python3.8/dist-packages/modiscolite/extract_seqlets.py", line 34, in _laplacian_null
(np.percentile(a=pos_values, q=percentiles_to_use)-mu))
File "<__array_function__ internals>", line 5, in percentile
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py", line 3867, in percentile
return _quantile_unchecked(
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py", line 3986, in _quantile_unchecked
r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py", line 3564, in _ureduce
r = func(a, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py", line 4098, in _quantile_ureduce_func
n = np.isnan(ap[-1])
IndexError: index -1 is out of bounds for axis 0 with size 0
If you're using the NumPy array inputs, the shape should be (batch, 4, length), not (batch, length, 4). Sorry for the inconsistency. I'll add more documentation soon, once I finish something else.
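For arrays already stored as (batch, length, 4), like the (161, 300, 4) arrays printed above, a minimal sketch of the conversion might look like the following. The helper name `to_batch_channel_length` and the random stand-in data are assumptions made for illustration; they are not part of modisco-lite. Only NumPy calls are used.

```python
import numpy as np


def to_batch_channel_length(arr):
    """Reorder a (batch, length, 4) array to (batch, 4, length).

    Hypothetical helper for illustration; not part of modisco-lite.
    """
    arr = np.asarray(arr)
    if arr.ndim != 3 or arr.shape[2] != 4:
        raise ValueError(f"expected shape (batch, length, 4), got {arr.shape}")
    # Swap the last two axes and make the result contiguous before saving.
    return np.ascontiguousarray(arr.transpose(0, 2, 1))


# Stand-in data with the same shapes as in the post above (assumed values).
rng = np.random.default_rng(0)
n_seqs, seq_len = 161, 300
base_idx = rng.integers(0, 4, size=(n_seqs, seq_len))
ohe_nl4 = np.eye(4, dtype=np.int8)[base_idx]                # (161, 300, 4)
shap_nl4 = rng.normal(scale=1e-4, size=(n_seqs, seq_len, 4))

ohe = to_batch_channel_length(ohe_nl4)
shap = to_batch_channel_length(shap_nl4)
print(ohe.shape, shap.shape)
```

The converted arrays can then be written with `np.save` to the same paths as before and passed to `-s` and `-a` unchanged.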
Hi Jacob,
Thanks for the fast response. I got it to work with the dimensions you mentioned in your answer. Thanks for your work!
Sorry, I couldn't find the toy data you mentioned. Could you kindly provide it to help us understand the input?