vaquerizaslab / fanc

FAN-C: Framework for the ANalysis of C-like data
GNU General Public License v3.0

How to fix "black region" in AB compartment analysis (saddleplot) #96

Open hiroyukikato911 opened 2 years ago

hiroyukikato911 commented 2 years ago

Hi, thanks for this wonderful tool!

I get this picture with a black region in the middle.

(Image: A_250kb_profile_4%)

when I run

fanc compartments -e A_100kb_profile_4%.eps \
    /Volumes/hic/A_inter_30.hic@100kb \
    A_100kb.ab \
    -p 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 96 100 \
    -s 0 \
    -m A_100kb_matrix_4%.txt \
    --compartment-strength A_100kb_strength_4%.txt \
    -c coolwarm \
    -g /Users/genomes/hg19/hg19.fa

My questions are as follows.

  1. Is there any way to fix this black region? In the output matrix text file, these black regions appear to be scored "nan". Is it reasonable to redraw the figure after deleting these "nan" regions?
  2. I used the -s 0 option. Is it correct to assume that the border between the active "A" and inactive "B" regions coincides with the black cross?
  3. I used a .hic file from Juicer. Is the data already normalized in this analysis, for example by KR? Or do I need to specify A_inter_30.hic@100kb@KR instead?

Best regards, Hiroyuki

fanc version is 0.9.22.

kaukrise commented 2 years ago

Hi, and thank you for the nice words!

Regarding your questions:

  1. I have never encountered these black NaN regions myself. Would you have a Juicer Hi-C file, perhaps from 4DN, that I could use to reproduce the issue?
  2. Yes, that is correct: -s 0 divides the EV values into two groups, larger and smaller than 0, which correspond to A and B, respectively. Each group is then divided separately into the percentiles you chose.
  3. For Juicer files up to and including v8, KR norm is chosen by default. For Juicer v9 files, SCALE norm is chosen.
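The split described in point 2 can be illustrated with a minimal NumPy sketch. This is not FAN-C's own code: the function name `split_percentile_cutoffs` is hypothetical, and placing values exactly equal to the split in the upper group is an assumption.

```python
import numpy as np

def split_percentile_cutoffs(ev, percentiles, split=0.0):
    """Compute percentile cutoffs separately for EV values below and
    at/above `split` (hypothetical helper, not part of FAN-C)."""
    ev = np.asarray(ev, dtype=float)
    ev = ev[np.isfinite(ev)]      # drop bins without an eigenvector value
    below = ev[ev < split]        # "B"-like group
    above = ev[ev >= split]       # "A"-like group (assumption: 0 goes here)
    cut_below = np.percentile(below, percentiles) if below.size else np.array([])
    cut_above = np.percentile(above, percentiles) if above.size else np.array([])
    return cut_below, cut_above

# Toy eigenvector with one missing value
ev = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0, np.nan])
cut_b, cut_a = split_percentile_cutoffs(ev, [50, 100])
print(cut_b)  # cutoffs within the negative group
print(cut_a)  # cutoffs within the non-negative group
```

Because each group gets its own cutoffs, the boundary between the two groups always falls at the split value, regardless of how many regions are on either side.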

Cheers, Kai

hiroyukikato911 commented 2 years ago

Hi,

Thanks for your response! I notice some error messages that might relate to this issue.

When generating .ab files by running

fanc compartments /Volumes/A_inter_30.hic@100kb A_100kb.ab

I get this error:

/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/compatibility/juicer.py:619: RuntimeWarning: divide by zero encountered in true_divide
  return np.array(vectors[resolution]) / sf

Traceback (most recent call last):
  File "/Users/.pyenv/versions/3.9.1/bin/fanc", line 127, in <module>
    Fanc()
  File "/Users/.pyenv/versions/3.9.1/bin/fanc", line 93, in __init__
    command([sys.argv[0]] + sys.argv[option_ix:], log_level=log_level, verbosity=verbosity)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/commands/fanc_commands.py", line 4097, in compartments
    ab_matrix = ABCompartmentMatrix.from_hic(matrix,
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/architecture/compartments.py", line 148, in from_hic
    ab_matrix.flush()
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/matrix.py", line 1706, in flush
    self._flush_edges(silent=silent, update_mappability=update_mappability)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/matrix.py", line 2355, in _flush_edges
    RegionPairsTable._flush_edges(self, **kwargs)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/matrix.py", line 1692, in _flush_edges
    self._enable_edge_indexes()
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/matrix.py", line 1723, in _enable_edge_indexes
    create_col_index(edge_table.cols.source)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/tools/general.py", line 188, in create_col_index
    col.create_index()
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/table.py", line 3564, in create_index
    idxrows = _column__create_index(self, optlevel, kind, filters,
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/table.py", line 284, in _column__create_index
    index = Index(
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/index.py", line 381, in __init__
    super().__init__(parentnode, name, title, new, filters)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/group.py", line 221, in __init__
    super().__init__(parentnode, name, _log)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/node.py", line 258, in __init__
    self._g_post_init_hook()
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/index.py", line 535, in _g_post_init_hook
    self.create_temp()
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/index.py", line 992, in create_temp
    self.tmpfile = self._openFile(self.tmpfilename, "w")
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/file.py", line 300, in open_file
    return File(filename, mode, title, root_uep, filters, **kwargs)
  File "/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/tables/file.py", line 750, in __init__
    self._g_new(filename, mode, **params)
  File "tables/hdf5extension.pyx", line 486, in tables.hdf5extension.File._g_new
tables.exceptions.HDF5ExtError: HDF5 error back trace

  File "H5F.c", line 532, in H5Fcreate
    unable to create file
  File "H5VLcallback.c", line 3282, in H5VL_file_create
    file create failed
  File "H5VLcallback.c", line 3248, in H5VL__file_create
    file create failed
  File "H5VLnative_file.c", line 63, in H5VL__native_file_create
    unable to create file
  File "H5Fint.c", line 1858, in H5F_open
    unable to truncate a file which is already open

End of HDF5 error back trace

Unable to open/create file '/Volumes/FAN-C/pytables-tzfsjlla.tmp'
Closing remaining open files:A_100kb_KR.ab...done

Although I get this error, the output file still exists, so I proceeded. My HDF5 version seems to meet the minimum requirement.

HDF5 Version: 1.10.6

Next, when generating saddle plots by running

fanc compartments -e A_100kb_profile_4%.eps \
    /Volumes/hic/A_inter_30.hic@100kb \
    A_100kb.ab \
    -p 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 96 100 \
    -s 0 \
    -m A_100kb_matrix_4%.txt \
    --compartment-strength A_100kb_strength_4%.txt \
    -c coolwarm \
    -g /Users/genomes/hg19/hg19.fa

I get this error:

/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/architecture/helpers.py:94: UserWarning: Warning: converting a masked element to nan.
  m[i_bin, j_bin] += value
/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/architecture/helpers.py:96: UserWarning: Warning: converting a masked element to nan.
  m[j_bin, i_bin] += value
/Users/.pyenv/versions/3.9.1/lib/python3.9/site-packages/fanc/architecture/helpers.py:124: RuntimeWarning: invalid value encountered in true_divide
  m /= c
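The source lines quoted in these warnings (`m[i_bin, j_bin] += value` and `m /= c`) suggest that values are summed per pair of percentile bins and then divided by a count matrix. Under that reading, a bin pair that never receives a value would end up as 0/0, i.e. NaN. The sketch below only illustrates that mechanism on toy data; `average_by_bin` is a hypothetical function, not FAN-C's implementation, and whether this is the actual cause of the black region is not confirmed in this thread.

```python
import numpy as np

def average_by_bin(values, i_bins, j_bins, n_bins):
    """Average values per (i, j) bin pair, symmetrically.
    Hypothetical illustration, not FAN-C code."""
    m = np.zeros((n_bins, n_bins))  # running sums
    c = np.zeros((n_bins, n_bins))  # number of contributions per cell
    for value, i, j in zip(values, i_bins, j_bins):
        if not np.isfinite(value):
            continue                # skip masked / NaN inputs
        m[i, j] += value
        c[i, j] += 1
        if i != j:                  # mirror off-diagonal entries
            m[j, i] += value
            c[j, i] += 1
    out = np.full_like(m, np.nan)
    # Divide only where something was counted; empty cells stay NaN
    np.divide(m, c, out=out, where=c > 0)
    return out, c

# Toy input: bin 1 never receives a value
avg, counts = average_by_bin([1.0, 3.0, 2.0], [0, 0, 0], [0, 0, 2], n_bins=3)
empty = np.where(counts.sum(axis=0) == 0)[0]
print(avg)
print(empty)  # indices of bins with no contributions at all
```

Listing the bins with a zero count in this way shows which percentile bins are empty, which would appear as a full NaN row and column (a cross) in the resulting matrix.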

These .hic files require ethical approval to upload, but my other .hic files are available here: https://ddbj.nig.ac.jp/public/ddbj_database/gea/experiment/E-GEAD-000/E-GEAD-415/E-GEAD-415.processed.zip

(I apologize that the file is around 41.9 GB.)

Any help is appreciated. Thanks in advance!

Best, Hiroyuki