aidenlab / straw

Extract data quickly from Juicebox via straw
MIT License

error: File doesn't have the given chr_chr map. #132

Open charlottewright opened 7 months ago

charlottewright commented 7 months ago

Hello,

When I run straw('KR', hic_file, 'LR994552.1', 'LR994551.1', 'BP', res) on a .hic file generated by yahs, the command runs fine.

However, when I try it on a .hic file that was converted from a .cool file, I get an error:

row, col, data = map(np.array, straw('KR', hic_file, 'LR994552.1', 'LR994551.1','BP', res))
File doesn't have the given chr_chr map.  

I don't think that chromosome names are the issue because when I try using the wrong chromosome names, I get a different error:

row, col, data = map(np.array, straw('KR', hic_file, '1.1', '1.1','BP', res))
One of the chromosomes wasn't found in the file. Check that the chromosome name matches the genome.
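
One way to narrow this down is to check whether the failure is tied to the specific chromosome pair or to the 'KR' normalization. The following is a minimal sketch, not part of straw itself: `probe_queries` and `count_records` are hypothetical names, and `count_records` is assumed to wrap whatever straw call is in use and return the number of records it yields.

```python
from itertools import combinations_with_replacement


def probe_queries(count_records, chroms, norms=("NONE", "KR")):
    """Try each normalization on each unordered chromosome pair.

    count_records(norm, chr1, chr2) is a hypothetical wrapper that runs
    the query and returns how many records came back.
    Returns a dict mapping (norm, chr1, chr2) to "ok", "empty", or
    "error: <message>".
    """
    results = {}
    for norm in norms:
        for c1, c2 in combinations_with_replacement(chroms, 2):
            try:
                n = count_records(norm, c1, c2)
                status = "ok" if n > 0 else "empty"
            except Exception as exc:  # report every failure instead of stopping
                status = f"error: {exc}"
            results[(norm, c1, c2)] = status
    return results


# Possible wrapper, following the call shape used in this post
# (assumes straw returns three parallel lists: row, col, data):
# count = lambda norm, c1, c2: len(straw(norm, hic_file, c1, c2, 'BP', res)[0])
# for key, status in probe_queries(count, ['LR994552.1', 'LR994551.1']).items():
#     print(key, status)
```

If the same-chromosome queries succeed with 'NONE' while the cross-chromosome or 'KR' queries fail, that would point to the corresponding matrix or normalization vector being absent from the file, not to a naming problem.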

I've also tried renaming all chromosomes in both the matrix and the chr_sizes file before generating the .hic file (e.g. to 'chr1', 'chr2' or to '1', '2') and then using the new names in the straw command. I've also tried sorting the chr_sizes file (in both directions) before generating the .hic file.

The .hic file that gives this error was generated from a .cool file, which was first transformed like so:

hicTransform --matrix run1.40000.cool --outFileName run1.40000.pearson_correlation.cool --method pearson

and then converted into a .hic file like so:

python3 cool2hic.py -i run1.40000.pearson_correlation.cool -r 40000 # generates 'matrix.txt'
java -jar -Xmx100G ~/software/juicer_tools/juicer_tools_1.22.01.jar pre -r 40000 -d matrix.txt run1.40000.pearson_correlation.hic chrSizes.tsv

This method comes from https://www.biostars.org/p/360254/. The .hic file loads fine in Juicebox, so it does not appear to be corrupted.
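
Since the error concerns a specific chromosome pair, it may also help to check whether the intermediate matrix.txt contains any records for that pair at all. The sketch below is illustrative only: `count_chrom_pairs` is a hypothetical helper, and it assumes matrix.txt uses the whitespace-delimited 9-column "short with score" layout accepted by juicer_tools pre (str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 score). If cool2hic.py writes a different layout, the column indices would need adjusting.

```python
from collections import Counter


def count_chrom_pairs(lines):
    """Count records per unordered chromosome pair.

    Assumed layout (an assumption, not confirmed by the post):
    str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 score
    so chr1 is field 1 and chr2 is field 5 (0-based).
    """
    pairs = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank or unexpected lines
        pair = tuple(sorted((fields[1], fields[5])))
        pairs[pair] += 1
    return pairs


# Example usage with the file from this post:
# with open("matrix.txt") as handle:
#     pairs = count_chrom_pairs(handle)
# print(pairs.get(("LR994551.1", "LR994552.1"), 0))
```

A count of zero for the queried pair would suggest that the cross-chromosome block was never written to matrix.txt, in which case the resulting .hic file would have no matrix for straw to return.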

I'm using hic-straw==1.3.1, installed with python3 -m pip install hic-straw.

Any help in getting straw to read this hic file or alternative ways to obtain a hic file would be greatly appreciated.

Many thanks,

Charlotte