JinLabBioinfo / DeepLoop

DeepLoop robustly identifies enhancer-promoter interactions from low-depth and single-cell Hi-C data

Fails on some chromosomes when running LoopEnhance #7

Open zhuyezhang opened 1 month ago

zhuyezhang commented 1 month ago

When I ran LoopEnhance, I got errors on some chromosomes, while others worked fine. It also worked when I used the parameters "--small_matrix_size 64 --step_size 64" instead of "--small_matrix_size 128 --step_size 128". However, the small matrix size used in training should be 128. Do these parameters need to be the same for training and prediction? Thanks for your help!

When running with the parameters "--small_matrix_size 128 --step_size 128", I got the following error:

Traceback (most recent call last):
  File "/share/home/shenlab/zhuyezhang/tools/DeepLoop-master/prediction/predict_chromosome.py", line 213, in <module>
    dummy, max_dist, val_cols, keep_zeros)
  File "/share/home/shenlab/zhuyezhang/tools/DeepLoop-master/prediction/predict_chromosome.py", line 151, in predict_and_write
    keep_zeros=keep_zeros)
  File "/share/home/shenlab/zhuyezhang/tools/DeepLoop-master/prediction/predict_chromosome.py", line 69, in sparse_prediction_from_file
    tile = matrix[rows, cols].A  # split matrix into tiles
  File "/share/home/shenlab/zhuyezhang/miniconda3/envs/deeploop/lib/python3.5/site-packages/scipy/sparse/csr.py", line 304, in __getitem__
    return self._get_submatrix(row, col)
  File "/share/home/shenlab/zhuyezhang/miniconda3/envs/deeploop/lib/python3.5/site-packages/scipy/sparse/csr.py", line 447, in _get_submatrix
    check_bounds(i0, i1, M)
  File "/share/home/shenlab/zhuyezhang/miniconda3/envs/deeploop/lib/python3.5/site-packages/scipy/sparse/csr.py", line 443, in check_bounds
    " %d <= %d" % (i0, num, i1, num, i0, i1))
IndexError: index out of bounds: 0 <= 975 <= 1024, 0 <= 79 <= 1024, 975 <= 79

dylan-plummer commented 2 days ago

The error seems to indicate that it tried to split up a chromosome matrix into 128x128 tiles but the matrix was only 79x79 to begin with. This could be because the chromosome is smaller than expected (the current prediction script assumes the dense matrix is at least 1024x1024). Does it seem to fail on small chromosomes specifically? What genome is your data mapped to?
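
As a rough illustration of guarding against out-of-range tile slices like the one in the traceback, here is a minimal sketch. It is not the code in predict_chromosome.py; the helper names `iter_tile_bounds` and `pad_to_tile_multiple` are hypothetical, and zero-padding a small chromosome matrix up to a whole number of tiles is an assumption about what would be acceptable for your data.

```python
from scipy import sparse


def iter_tile_bounds(n, tile_size, step):
    """Yield (start, stop) pairs for tiles that fit entirely inside [0, n)."""
    if n < tile_size:
        return  # the matrix is smaller than a single tile, so nothing fits
    for start in range(0, n - tile_size + 1, step):
        yield start, start + tile_size


def pad_to_tile_multiple(matrix, tile_size):
    """Zero-pad a square sparse matrix so its side is a multiple of tile_size.

    Hypothetical helper: the padded rows and columns contain only zeros.
    """
    n = matrix.shape[0]
    # Round n up to the next multiple of tile_size (at least one full tile).
    padded = max(tile_size, -(-n // tile_size) * tile_size)
    coo = matrix.tocoo()
    return sparse.coo_matrix(
        (coo.data, (coo.row, coo.col)), shape=(padded, padded)
    ).tocsr()


if __name__ == "__main__":
    small = sparse.random(79, 79, density=0.1, format="csr", random_state=0)
    padded = pad_to_tile_multiple(small, tile_size=128)
    for r0, r1 in iter_tile_bounds(padded.shape[0], 128, 128):
        for c0, c1 in iter_tile_bounds(padded.shape[1], 128, 128):
            tile = padded[r0:r1, c0:c1].toarray()  # every slice stays in bounds
            print((r0, c0), tile.shape)
```

With a check like this, a matrix smaller than one tile yields no tiles instead of raising an IndexError, and padding lets such a matrix still be processed as a single tile.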

Since the DeepLoop models are fully convolutional, using a smaller matrix size than the one used in training might be OK, but you may see padding artifacts around the tile edges. You can mitigate this by overlapping the sliding window, i.e. setting the step size to a fraction of the small matrix size (e.g., --small_matrix_size 128 --step_size 64).
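
To make the overlapping-window idea concrete, here is a minimal NumPy sketch. It assumes overlapping tile predictions are combined by simple averaging, which may differ from how the DeepLoop prediction script handles overlaps; `tile_starts`, `predict_overlapping`, and the `predict_tile` callback are hypothetical names standing in for the model call.

```python
import numpy as np


def tile_starts(n, tile_size, step):
    """Start offsets of windows covering [0, n); the last window ends exactly at n."""
    if n <= tile_size:
        return [0]
    starts = list(range(0, n - tile_size + 1, step))
    if starts[-1] != n - tile_size:
        starts.append(n - tile_size)  # shift a final window back to reach the edge
    return starts


def predict_overlapping(dense, predict_tile, tile_size=128, step=64):
    """Average per-tile predictions over overlapping windows of a square matrix.

    predict_tile is a stand-in for the model: it takes a 2D tile and returns
    an array of the same shape.
    """
    n = dense.shape[0]
    total = np.zeros_like(dense, dtype=float)
    counts = np.zeros_like(dense, dtype=float)
    starts = tile_starts(n, tile_size, step)
    for r in starts:
        for c in starts:
            tile = dense[r:r + tile_size, c:c + tile_size]
            total[r:r + tile_size, c:c + tile_size] += predict_tile(tile)
            counts[r:r + tile_size, c:c + tile_size] += 1
    return total / counts  # every cell is covered at least once


if __name__ == "__main__":
    matrix = np.random.rand(256, 256)
    # An identity "model" should return the input unchanged after averaging.
    out = predict_overlapping(matrix, lambda t: t, tile_size=128, step=64)
    print(np.allclose(out, matrix))
```

With a step of half the tile size, most cells away from the matrix border are predicted by several windows, so a cell near the edge of one tile usually sits closer to the middle of another, which is why overlap tends to soften edge artifacts.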