deeptools / HiCExplorer

HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
https://hicexplorer.readthedocs.org
GNU General Public License v3.0

hicPCA AttributeError : 'list' object has no attribute 'real' #655

Closed shiyi-pan closed 3 years ago

shiyi-pan commented 3 years ago

Hi, I used hicPCA to analyze my Hi-C data. Here is my command:

hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.Chrosome.gene.sorted.bed

My BED file looks like this:

Chr01 150057 150620 NN01g00001.1 0 - 150057 150620 0 2 169,80, 0,483,
Chr01 238606 249703 NN01g00002.1 0 + 238606 249703 0 3 2160,251,31, 0,9261,11066,
Chr01 258601 264467 NN01g00003.1 0 - 258601 264467 0 9 330,51,307,119,65,180,211,48,141, 0,756,855,1195,1400,2106,5207,5605,5725,
Chr01 264993 267366 NN01g00004.1 0 + 264993 267366 0 4 34,65,55,290, 0,474,775,2083,

And I got this error:

Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/bin/hicPCA", line 7, in <module>
    main()
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 334, in main
    vecs_list = correlateEigenvectorWithGeneTrack(ma, vecs_list, args.extraTrack)
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 175, in correlateEigenvectorWithGeneTrack
    _correlation = pearsonr(eigenvector[bin_id[0]:bin_id[1]].real,
AttributeError: 'list' object has no attribute 'real'
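For readers who hit the same traceback: the failing expression slices `eigenvector` and then reads `.real`. A slice of a plain Python list is still a list, which has no `.real` attribute, whereas individual numbers (and NumPy arrays) do. The sketch below reproduces that failure mode with made-up values. It is only an illustration of the error, not the HiCExplorer code or its fix, and both helper names are hypothetical.

```python
# Made-up stand-in for one eigenvector stored as a plain Python list.
eigenvector = [0.5 + 0j, -0.25 + 0j, 0.125 + 0j]


def slice_real_naive(values, start, end):
    # Mirrors the failing pattern: a list slice is a list, so .real is missing.
    return values[start:end].real


def slice_real_safe(values, start, end):
    # Take the real part element by element instead.
    return [complex(v).real for v in values[start:end]]


try:
    slice_real_naive(eigenvector, 0, 2)
    message = ""
except AttributeError as err:
    message = str(err)

real_parts = slice_real_safe(eigenvector, 0, 2)
print(message)
print(real_parts)
```

Converting the list to an array before slicing would have the same effect as the element-wise version, since array slices expose `.real` directly.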

Could you help me fix this error? Thank you very much.

LeilyR commented 3 years ago

Which version are you using?

shiyi-pan commented 3 years ago

Thank you for your reply, LeilyR. The version I am using is 3.4.1.

LeilyR commented 3 years ago

Did you get this warning message: "Number of fields in BED file is not standard. Assuming bed6."? Are you sure that the chromosome names in your matrix are the same as in your BED file (e.g. Chr01)?

joachimwolff commented 3 years ago

> Thank you for your reply, LeilyR. The version I am using is 3.4.1.

Please update to HiCExplorer version 3.6.

shiyi-pan commented 3 years ago

Thank you both for your replies, LeilyR and joachimwolff. There was no warning message like the one you described. Because hic_matrix.h5 is a binary file, I can't inspect it, but the chromosome names in my reference genome use the "Chr01" format, so I think the names in the matrix are the same. I will try updating to HiCExplorer version 3.6. Thank you again.

joachimwolff commented 3 years ago

Please check the output of hicInfo -m yourmatrix.h5. It will tell you what chromosome names you used to create your Hi-C interaction matrix.

The given format of the chromosome names does not follow either of the two usual standards: UCSC (chromosome 1 is named chr1) or Ensembl (chromosome 1 is named 1); see UCSC. Please change your data to one of the two standards.
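One way to compare naming schemes is to collect the chromosome names from the BED file and set them against the names that hicInfo reports. The sketch below does this in plain Python. `bed_chromosomes` and `compare_names` are hypothetical helpers, and the matrix names are typed in by hand here rather than read from the .h5 file.

```python
def bed_chromosomes(bed_lines):
    """Collect chromosome names (column 1) from BED lines, skipping headers."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def compare_names(bed_names, matrix_names):
    """Return (names only in the BED file, names only in the matrix)."""
    bed_names, matrix_names = set(bed_names), set(matrix_names)
    return sorted(bed_names - matrix_names), sorted(matrix_names - bed_names)


bed_lines = [
    "Chr01\t150057\t150620\tNN01g00001.1\t0\t-",
    "Chr01\t238606\t249703\tNN01g00002.1\t0\t+",
]
# Names as listed by hicInfo, copied by hand for this example.
matrix_names = ["Chr01", "Chr02"]

only_bed, only_matrix = compare_names(bed_chromosomes(bed_lines), matrix_names)
print("only in BED:", only_bed)
print("only in matrix:", only_matrix)
```

An empty "only in BED" list means every chromosome named in the BED file also exists in the matrix; names that appear only in the matrix are harmless for this check.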

joachimwolff commented 3 years ago

@shiyi-pan Please check whether the fix provided in our develop branch solves your issue.

Best,

Joachim

shiyi-pan commented 3 years ago

Sorry for the late reply. I tried to download the latest HiCExplorer and failed many times. I have now upgraded HiCExplorer, but it doesn't fix the problem. Here is my matrix information; it seems normal.

Matrix information file. Created with HiCExplorer's hicInfo version 3.6

File: hic_corrected.h5
Size: 96,060
Bin_length: 10000
Sum of matrix: 34285422.90995298
Chromosomes:length: Chr01: 57180813 bp; Chr02: 49974985 bp; Chr03: 47135343 bp; Chr04: 50772388 bp; Chr05: 40763724 bp; Chr06: 49765926 bp; Chr07: 44477389 bp; Chr08: 47918604 bp; Chr09: 48783406 bp; Chr10: 52870952 bp; Chr11: 39316363 bp; Chr12: 40945015 bp; Chr13: 45260557 bp; Chr14: 49694925 bp; Chr15: 52150659 bp; Chr16: 37044301 bp; Chr17: 41996393 bp; Chr18: 58227433 bp; Chr19: 49303783 bp; Chr20: 47338249 bp; Scaffold_1: 1376331 bp; Scaffold_10: 79343 bp; Scaffold_100: 37068 bp; Scaffold_101: 42738 bp; Scaffold_102: 26844 bp; Scaffold_103: 228794 bp; Scaffold_104: 32224 bp; Scaffold_105: 64192 bp; Scaffold_106: 18086 bp; Scaffold_107: 24298 bp; Scaffold_108: 32567 bp; Scaffold_109: 26501 bp; Scaffold_11: 68995 bp; Scaffold_110: 30230 bp; Scaffold_111: 32186 bp; Scaffold_112: 29075 bp; Scaffold_113: 44707 bp; Scaffold_114: 31011 bp; Scaffold_115: 53279 bp; Scaffold_116: 50853 bp; Scaffold_117: 32689 bp; Scaffold_118: 22740 bp; Scaffold_119: 43728 bp; Scaffold_12: 82648 bp; Scaffold_120: 29361 bp; Scaffold_121: 77151 bp; Scaffold_122: 24454 bp; Scaffold_123: 27786 bp; Scaffold_124: 24495 bp; Scaffold_125: 25183 bp; Scaffold_126: 72182 bp; Scaffold_127: 53744 bp; Scaffold_128: 26293 bp; Scaffold_129: 37480 bp; Scaffold_13: 100901 bp; Scaffold_130: 26616 bp; Scaffold_131: 47216 bp; Scaffold_132: 25511 bp; Scaffold_133: 38574 bp; Scaffold_134: 39819 bp; Scaffold_135: 31471 bp; Scaffold_136: 68622 bp; Scaffold_137: 35934 bp; Scaffold_138: 27115 bp; Scaffold_139: 30448 bp; Scaffold_14: 49877 bp; Scaffold_140: 103730 bp; Scaffold_141: 25958 bp; Scaffold_142: 74686 bp; Scaffold_143: 34985 bp; Scaffold_144: 48180 bp; Scaffold_145: 145388 bp; Scaffold_146: 34480 bp; Scaffold_147: 84396 bp; Scaffold_148: 95665 bp; Scaffold_149: 45280 bp; Scaffold_15: 65928 bp; Scaffold_150: 35715 bp; Scaffold_151: 31258 bp; Scaffold_152: 28092 bp; Scaffold_153: 27805 bp; Scaffold_154: 27766 bp; Scaffold_155: 24449 bp; Scaffold_156: 20304 bp; Scaffold_157: 109821 bp; Scaffold_158: 21654 bp; Scaffold_159: 24454 bp; Scaffold_16: 34092 bp; Scaffold_17: 23083 bp; Scaffold_18: 85037 bp; Scaffold_19: 52024 bp; Scaffold_2: 926866 bp; Scaffold_20: 37941 bp; Scaffold_21: 77000 bp; Scaffold_22: 28139 bp; Scaffold_23: 27054 bp; Scaffold_24: 24918 bp; Scaffold_25: 29551 bp; Scaffold_26: 29820 bp; Scaffold_27: 21547 bp; Scaffold_28: 31000 bp; Scaffold_29: 60459 bp; Scaffold_3: 54977 bp; Scaffold_30: 14929 bp; Scaffold_31: 12941 bp; Scaffold_32: 12410 bp; Scaffold_33: 12045 bp; Scaffold_34: 10478 bp; Scaffold_35: 9797 bp; Scaffold_36: 7715 bp; Scaffold_37: 4901 bp; Scaffold_38: 4038 bp; Scaffold_39: 5000 bp; Scaffold_4: 28354 bp; Scaffold_40: 25000 bp; Scaffold_41: 50000 bp; Scaffold_42: 50000 bp; Scaffold_43: 50000 bp; Scaffold_44: 50000 bp; Scaffold_45: 25000 bp; Scaffold_46: 25000 bp; Scaffold_47: 25000 bp; Scaffold_48: 25000 bp; Scaffold_49: 25000 bp; Scaffold_5: 29883 bp; Scaffold_50: 25000 bp; Scaffold_51: 25000 bp; Scaffold_52: 25000 bp; Scaffold_53: 5000 bp; Scaffold_54: 25000 bp; Scaffold_55: 50000 bp; Scaffold_56: 34282 bp; Scaffold_57: 47162 bp; Scaffold_58: 25997 bp; Scaffold_59: 19013 bp; Scaffold_6: 40904 bp; Scaffold_60: 25693 bp; Scaffold_61: 84331 bp; Scaffold_62: 37296 bp; Scaffold_63: 46469 bp; Scaffold_64: 27074 bp; Scaffold_65: 17918 bp; Scaffold_66: 55362 bp; Scaffold_67: 37606 bp; Scaffold_68: 33523 bp; Scaffold_69: 44903 bp; Scaffold_7: 32595 bp; Scaffold_70: 28561 bp; Scaffold_71: 36620 bp; Scaffold_72: 60085 bp; Scaffold_73: 105810 bp; Scaffold_74: 23713 bp; Scaffold_75: 31008 bp; Scaffold_76: 26196 bp; Scaffold_77: 23870 bp; Scaffold_78: 22335 bp; Scaffold_79: 45524 bp; Scaffold_8: 51638 bp; Scaffold_80: 45152 bp; Scaffold_81: 35854 bp; Scaffold_82: 33673 bp; Scaffold_83: 69659 bp; Scaffold_84: 96453 bp; Scaffold_85: 47017 bp; Scaffold_86: 30893 bp; Scaffold_87: 25513 bp; Scaffold_88: 23061 bp; Scaffold_89: 66364 bp; Scaffold_9: 50400 bp; Scaffold_90: 66484 bp; Scaffold_91: 44568 bp; Scaffold_92: 24224 bp; Scaffold_93: 45891 bp; Scaffold_94: 95094 bp; Scaffold_95: 53584 bp; Scaffold_96: 27959 bp; Scaffold_97: 37118 bp; Scaffold_98: 37684 bp; Scaffold_99: 62049 bp;
Non-zero elements: 108,045,174
Minimum (non zero): 2.0425253050862239e-07
Maximum: 713.8334980460619
NaN bins: 0

Here is my BED file:

NN1138.Chrosome.gene.sorted.bed.zip

LeilyR commented 3 years ago

Are you using the develop branch? Please use the develop branch; we have fixed it there. You can use git clone -b develop <repo-url> and then python setup.py install in your conda env.

shiyi-pan commented 3 years ago

Hi, sorry for the late reply. I have installed the develop branch and it doesn't work. Is my BED file wrong, or is something else wrong? Thank you very much, LeilyR and Joachim.

joachimwolff commented 3 years ago

Can you specify what "it doesn't work" means?

shiyi-pan commented 3 years ago

Hi, thank you for your reply. I downloaded HiCExplorer with the command:

git clone -b develop https://github.com/deeptools/HiCExplorer.git

and installed it. Then I ran the command:

python /ds3512/home/panyp/ruanjian/python36/bin/hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.gene.sorted.bed

and got this error:

ERROR:hicmatrix.HiCMatrix:Index error
Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicmatrix/HiCMatrix.py", line 262, in getRegionBinRange
    endbin = sorted(self.interval_trees[chrname][endpos:endpos + 1])[0].data
IndexError: list index out of range
Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/bin/hicPCA", line 7, in <module>
    main()
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 338, in main
    vecs_list = correlateEigenvectorWithGeneTrack(ma, vecs_list, args.extraTrack)
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 162, in correlateEigenvectorWithGeneTrack
    gene_occurrence[bin_id[1]] += 1
TypeError: 'NoneType' object is not subscriptable
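Read together, the two tracebacks suggest that the bin lookup logged an IndexError and the caller then received `None` in `bin_id`, which is what makes `bin_id[1]` fail. The sketch below mimics that pattern with a toy lookup and shows a guard that skips regions which cannot be mapped to bins. `get_bin_range` and `count_genes_per_bin` are hypothetical stand-ins rather than the hicmatrix or hicPCA functions, and the chromosome name and lengths are made up.

```python
def get_bin_range(chrom_lengths, chrom, start, end, bin_size):
    """Toy lookup: return (first_bin, last_bin), or None when the region
    does not fit inside the named chromosome."""
    length = chrom_lengths.get(chrom)
    if length is None or start < 0 or start >= end or end > length:
        return None
    return start // bin_size, (end - 1) // bin_size


def count_genes_per_bin(genes, chrom_lengths, bin_size):
    """Count genes by their last bin, skipping regions with no bin range."""
    counts = {}
    skipped = []
    for chrom, start, end in genes:
        bin_range = get_bin_range(chrom_lengths, chrom, start, end, bin_size)
        if bin_range is None:
            # Without this guard, bin_range[1] raises
            # TypeError: 'NoneType' object is not subscriptable.
            skipped.append((chrom, start, end))
            continue
        key = (chrom, bin_range[1])
        counts[key] = counts.get(key, 0) + 1
    return counts, skipped


chrom_lengths = {"chrA": 25000}  # made-up chromosome
genes = [("chrA", 1000, 2000), ("chrA", 24000, 26000), ("chrB", 0, 100)]
counts, skipped = count_genes_per_bin(genes, chrom_lengths, 10000)
print(counts)
print(skipped)
```

The skipped list is the useful part when troubleshooting: it names the BED records that fall outside the matrix, either past the chromosome end or on a chromosome the matrix does not contain.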

Here is my BED file:

NN1138.Chrosome.gene.sorted.bed.zip

The hicInfo (version 3.6) output for my hic_corrected.h5 file is identical to the one in my earlier comment above.

Could you help me fix this problem? Thank you very much.

14stutzmanav commented 3 years ago

Hi all,

Is there a way to troubleshoot the error message, "TypeError: 'NoneType' object is not subscriptable"? I am also encountering this message when I use hicPCA and specify a gene track as the extra track.

Thank you!

LeilyR commented 3 years ago

Could you please tell us which version you are using and which command line you are trying to run?

LeilyR commented 3 years ago

@shiyi-pan The error you get is due to a gene coordinate that exceeds the length of the chromosome (IndexError: list index out of range). Please check that the genes come from the same genome version as the reference you used to map your Hi-C data. In the future we will add a check that reports a clearer message.
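A check like the one described can be done outside HiCExplorer before running hicPCA. The following is a minimal sketch that reads a chrom.sizes-style table and reports every BED record whose end lies beyond its chromosome, or whose chromosome is missing from the table. `read_chrom_sizes` and `find_out_of_range` are hypothetical helper names. The Chr01 length and the first BED record come from this thread, while the second record is made up to trigger the report.

```python
def read_chrom_sizes(lines):
    """Parse 'name<whitespace>length' lines into a dict."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def find_out_of_range(bed_lines, sizes):
    """Return (line_number, chrom, start, end, reason) for suspicious records."""
    problems = []
    for number, line in enumerate(bed_lines, start=1):
        fields = line.split()
        if len(fields) < 3 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in sizes:
            problems.append((number, chrom, start, end, "unknown chromosome"))
        elif end > sizes[chrom]:
            problems.append((number, chrom, start, end, "end beyond chromosome length"))
    return problems


sizes = read_chrom_sizes(["Chr01\t57180813"])
bed_lines = [
    "Chr01\t150057\t150620\tNN01g00001.1\t0\t-",
    "Chr01\t57180000\t57190000\tmade_up_gene\t0\t+",  # made-up record
]
problems = find_out_of_range(bed_lines, sizes)
for problem in problems:
    print(problem)
```

In practice the two lists would be read from the real files, for example with `open(path)` passed directly as the `lines` or `bed_lines` argument.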

shiyi-pan commented 3 years ago

Thank you for your reply. My HiCExplorer version is 3.6, and here are my commands:

gff3ToGenePred NN1138.gene.gff3 NN1138.gene.GenePred
genePredToBed NN1138.gene.GenePred NN1138.gene.bed
sort -k1,1 -k2,2n NN1138.gene.bed > NN1138.gene.sorted.bed
python hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.gene.sorted.bed

I have checked the gene coordinates and they do not exceed the lengths of the chromosomes. Here are my NN1138.gene.sorted.bed file and NN1138-2.v1.0.chrom.sizes file: bed.zip

LeilyR commented 3 years ago

I can already see that you have Chr01 57147361 57173638, while your Chr01 length is 57173637. I have not checked the rest of your chromosomes; there might be more such cases of exceeding the chromosome length.

Also, when using --obsexpMatrix, add the extension to your matrix name (obs_exp.h5). This should not cause the error you reported, but you won't get the matrix otherwise.

shiyi-pan commented 3 years ago

Thank you for your reply. I'm sorry for my mistake: I had changed the NN1138-2.v1.0.chrom.sizes file to test my script. The true length of Chr01 is 57180813. Is there any disagreement between the NN1138-2.v1.0.chrom.sizes file and the NN1138.gene.sorted.bed file? Thank you again.
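One way to answer that kind of question is to compare the lengths hicInfo reports for the matrix with the lengths in the chrom.sizes file, since the matrix is what hicPCA actually uses. The sketch below parses the "name: N bp;" entries from a hicInfo chromosome line and lists every disagreement. `parse_hicinfo_lengths` and `compare_lengths` are hypothetical helpers, and the regular expression is only an assumption about the text layout shown in this thread. The numbers are the ones quoted above: 57180813 from hicInfo, 57173637 from the modified chrom.sizes file, and the gene end 57173638.

```python
import re


def parse_hicinfo_lengths(text):
    """Extract {chromosome: length} from 'Chr01: 57180813 bp;' style entries."""
    pairs = re.findall(r"([\w.]+):\s*(\d+)\s*bp", text)
    return {name: int(length) for name, length in pairs}


def compare_lengths(matrix_lengths, chrom_sizes):
    """Return {name: (matrix_length, chrom_sizes_length)} where they differ."""
    diffs = {}
    for name in sorted(set(matrix_lengths) | set(chrom_sizes)):
        in_matrix, in_sizes = matrix_lengths.get(name), chrom_sizes.get(name)
        if in_matrix != in_sizes:
            diffs[name] = (in_matrix, in_sizes)
    return diffs


hicinfo_line = "Chromosomes:length: Chr01: 57180813 bp; Chr02: 49974985 bp;"
matrix_lengths = parse_hicinfo_lengths(hicinfo_line)
chrom_sizes = {"Chr01": 57173637, "Chr02": 49974985}  # modified file from this thread

diffs = compare_lengths(matrix_lengths, chrom_sizes)
gene_end = 57173638
print(diffs)
print(gene_end <= matrix_lengths["Chr01"], gene_end <= chrom_sizes["Chr01"])
```

With these numbers the gene end 57173638 lies within the 57180813 bp that hicInfo reports for Chr01, and only exceeds the shortened 57173637 value, so that particular record would not explain the error on its own.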