annaorteu / wrath

Wrath: WRapped Analysis of Tagged Haplotypes
GNU General Public License v3.0

Program fails if only one SV automatically detected #9

Open gmkov opened 2 months ago

gmkov commented 2 months ago

When wrath only finds one SV with automatic detection, the program fails.

I noticed this when testing with very large window sizes, for speed.

RUN

chromosome=26
winSize=100000  # big for speed
start=5125000
end=6806000

nohup nice wrath -l \
-g $genome -c $chromosome -w $winSize -a $group -t 14 -s $start -e $end &

OUTPUT

One outlier is detected (not really an SV: it is close to the diagonal, etc., but I am not sure there is a filter to remove potential artifacts):

nrow,ncol,value,Estimate,Est.Error,Q2.5,Q97.5,upper,lower,z_score
15,16,0.0068236539,0.0101942322343456,0.000238777714817429,0.00748494333822929,0.0129035211304619,FALSE,TRUE,-2.03884654056388

[Image: outliers_50000_26_5125000_6806000_47 all brazil haplotagging n93_plot]

A heatmap plot is produced, but with no SV plotted (maybe it is just using plot_heatmap.py, as mentioned in https://github.com/annaorteu/wrath/issues/7).

No SVs/ table is produced.

ERROR

Traceback (most recent call last):
  File "/rds/user/mgm49/hpc-work/home/bin/wrath/sv_detection/sv_detection_and_heatmap.gmk.py", line 69, in <module>
    clustering = AgglomerativeClustering(n_clusters=None, distance_threshold=3, linkage='single').fit(points)
  File "/home/mgm49/miniconda3/lib/python3.8/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/home/mgm49/miniconda3/lib/python3.8/site-packages/sklearn/cluster/_agglomerative.py", line 978, in fit
    X = self._validate_data(X, ensure_min_samples=2)
  File "/home/mgm49/miniconda3/lib/python3.8/site-packages/sklearn/base.py", line 605, in _validate_data
    out = check_array(X, input_name="X", **check_params)
  File "/home/mgm49/miniconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 967, in check_array
    raise ValueError(
ValueError: Found array with 1 sample(s) (shape=(1, 2)) while a minimum of 2 is required by AgglomerativeClustering.
/home/mgm49/rds/hpc-work/home/bin/wrath/wrath.gmk: line 290: Detecting SVs and plotting of matrix wrath_out/matrices/jaccard_matrix_100000_26_5125000_6806000_47.all.brazil.haplotagging.n93.txt step failed: No such file or directory

SOLUTION

Not sure; I have not worked out exactly which function fails.
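
Going by the traceback alone, the exception is raised by the AgglomerativeClustering(...).fit(points) call at line 69 of sv_detection_and_heatmap.gmk.py, because scikit-learn requires at least two samples there. One possible workaround is to skip the clustering when fewer than two outliers are found. The sketch below is only an illustration under assumptions: cluster_outliers is a hypothetical helper, not a function in wrath, and it assumes points is a sequence of (row, col) pairs like the one in the outlier table above. The clustering parameters are copied from the traceback.

```python
def cluster_outliers(points, distance_threshold=3):
    """Return one cluster label per outlier point.

    Hypothetical helper, not part of wrath. `points` is assumed to be a
    sequence of (row, col) pairs, one per detected outlier.
    """
    n = len(points)
    if n == 0:
        # Nothing was detected, so there is nothing to cluster.
        return []
    if n == 1:
        # AgglomerativeClustering needs at least 2 samples, so a lone
        # outlier is treated as its own cluster with label 0.
        return [0]

    # Imported here so the guard above works without touching scikit-learn.
    from sklearn.cluster import AgglomerativeClustering

    # Same parameters as the call shown in the traceback.
    clustering = AgglomerativeClustering(
        n_clusters=None, distance_threshold=distance_threshold, linkage="single"
    ).fit(points)
    return clustering.labels_.tolist()


# The single outlier from the report (nrow=15, ncol=16) no longer raises.
print(cluster_outliers([(15, 16)]))  # prints [0]
```

Whether a lone outlier should be reported as an SV or dropped as a likely artifact is a separate design decision for the maintainers; the guard only stops the crash.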

This is minor and irrelevant to most users. It would come up more often only if the automatic SV detection were developed further, for example to filter for large SVs only. Just thought I'd let you know.