pubudumanoj closed 4 years ago
At first glance I'm not sure what the problem is. I've used your example matrix file and it works for me. The two things to pay attention to are 1) Do you get a message saying that it is loading your matrix file(s)? This is unfortunately a little tricky to assess, as the messages overwrite each other as the program steps through each phase of the dataset creation. However, if you redirect stderr into a file with hifive hic-data -X "test_*.MatA" out_filename.txt output.hic.data 2> hifive.log
then you can look at the log file and see if each file name appears. If not, then there is an issue with the automatic recognition of the file names, which brings me to the second thing to check. 2) Do the chromosome names in your matrix file names match those in your fend.bed file? If your fend.bed chromosome names are 'chr1', 'chr2', etc., then your matrix file names need to be test_chr1.mat, test_chr2.mat, test_chr1_bychr2.mat, and the argument you pass should be 'test*.mat'. If this isn't the issue, let me know and we can keep digging.
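As a quick way to run the second check before invoking hifive, here is a minimal Python sketch. It is not HiFive's own file-name parsing: the helper names, the "test_" prefix, the ".mat" suffix, and the "_by" separator are assumptions taken from the naming examples above.

```python
def chroms_from_matrix_name(filename, prefix="test_", suffix=".mat"):
    """Return the chromosome names encoded in a matrix file name, or None.

    Assumed naming scheme (from the examples in this thread, not from HiFive):
      test_chr1.mat        -> ["chr1"]
      test_chr1_bychr2.mat -> ["chr1", "chr2"]
    """
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        return None
    core = filename[len(prefix):-len(suffix)]
    return core.split("_by")


def unmatched_files(filenames, fend_chroms, prefix="test_", suffix=".mat"):
    """List the files whose name does not map onto known chromosome names."""
    bad = []
    for name in filenames:
        chroms = chroms_from_matrix_name(name, prefix, suffix)
        if chroms is None or any(c not in fend_chroms for c in chroms):
            bad.append(name)
    return bad


# fend_chroms would normally be read from the first column of fend.bed.
names = ["test_chr1.mat", "test_chr1_bychr2.mat", "test_1.mat", "test_chr1.MatA"]
print(unmatched_files(names, {"chr1", "chr2"}))
```

Any file name it prints is one that would not line up with the chromosome names under this assumed scheme.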
I renamed the files as you suggested and re-ran the command. I also removed all the column names and row names from the matrix file (as in the test data set), so the matrix is now square (number of rows = number of columns). Now I get this error
hifive hic-data -X "test_*.mat" binned.fends output.hic.data
Traceback (most recent call last):
File "/cvmfs/soft.mugqic/CentOS6/software/python/Python-2.7.14/bin/hifive", line 849, in
hifive hic-data -X "test_*.mat" binned.fends output.hic.data 2> log.txt
The log file produced is attached: log.txt
When I used a matrix file with column names and row names (as described in the question), I got this error
Traceback (most recent call last):
File "/cvmfs/soft.mugqic/CentOS6/software/python/Python-2.7.14/bin/hifive", line 849, in
Okay, this is good progress. The new issue is that the file is actually space-separated, not tab-separated. If you replace the spaces with tabs, I think everything should work.
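One way to do the replacement is a small Python helper like the sketch below. The function names and the file names in the usage comment are placeholders, not part of hifive, and the code assumes no field in the matrix file contains a space.

```python
def spaces_to_tabs(line):
    """Split one line on any run of whitespace and re-join it with tabs."""
    return "\t".join(line.split())


def convert_text(text):
    """Convert every line of a whitespace-separated matrix to tab-separated."""
    return "\n".join(spaces_to_tabs(line) for line in text.splitlines()) + "\n"


# Usage sketch (hypothetical file names):
# with open("test_chr1.mat") as src:
#     converted = convert_text(src.read())
# with open("test_chr1.tabs.mat", "w") as dst:
#     dst.write(converted)
```

Note that any leading or trailing whitespace on a line is dropped, so an empty first cell in a header row would not survive the conversion.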
Now I get this error
Traceback (most recent call last):
File "/cvmfs/soft.mugqic/CentOS6/software/python/Python-2.7.14/bin/hifive", line 849, in
The log file is attached: log.txt
I think it will be easier if I attach my matrix file, since the issue is probably in that file. Please check the attached matrix file https://drive.google.com/file/d/1nv_4yqpF-sLrWqXGEXJNcas0b3_bSKJv/view?usp=sharing
Thank you for helping
This was really only intended for loading raw data, so it expects integer values. The decimals are causing issues in loading the data. One option, since it looks like there are only integers and X.5 values, would be to simply double all of the counts. Also, I would suggest keeping the column and row labels, or double-checking that the number of rows is equal to the number of bins produced by your chromosome length/bed file and bin size.
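For matrices with fractions other than X.5, the same idea can be generalized by finding the smallest factor that turns every value into an integer. The following is only a sketch of that preprocessing step, not something hifive provides: the function names and the max_denominator cutoff are assumptions, and the values are assumed to be short decimals such as 0.5 or 0.1.

```python
from fractions import Fraction
from math import gcd


def smallest_integer_scale(values, max_denominator=1000):
    """Smallest integer factor that makes every value (close to) an integer."""
    scale = 1
    for v in values:
        # limit_denominator recovers e.g. 1/10 from the float 0.1
        d = Fraction(v).limit_denominator(max_denominator).denominator
        scale = scale * d // gcd(scale, d)  # least common multiple
    return scale


def scale_to_integers(rows):
    """Return (factor, integer matrix) for a matrix given as a list of rows."""
    factor = smallest_integer_scale([v for row in rows for v in row])
    return factor, [[int(round(v * factor)) for v in row] for row in rows]


factor, scaled = scale_to_integers([[7.0, 0.5], [0.5, 11.0]])
print(factor, scaled)
```

For a matrix holding only integers and X.5 values this picks a factor of 2, which matches the doubling suggested above.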
Yes, it worked when I multiplied all values by 2. However, if there are fractional values other than 0.5, is it okay to multiply by 10? Does it affect the quality score?
It will have a minor effect on the quality scores. The scale of that impact is going to depend on how sparse your data are. The number of empty bins (which will be different for each resolution) will be roughly proportional to the error introduced into the quality score.
If we use the same resolution (bin size) for all the matrices, then the error introduced by the multiplication would be the same, right?
If the samples are of similar sequencing depth, then the error should be very similar. There is a step that involves a pseudo count addition so the magnitude of the error is going to be influenced by the number of zero bins. However, now that I'm thinking about it carefully, the larger the scaling factor, the smaller the influence of the pseudo count, so I think multiplying by 10 should be fine.
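To make the scaling argument concrete, here is a toy calculation. It is not HiFive's actual quality-score computation: the function name and a pseudo count of 1 are assumptions used purely for illustration. It measures what fraction of a bin's value comes from the pseudo count after the raw count has been multiplied by a scaling factor.

```python
def pseudocount_share(count, scale, pseudo=1.0):
    """Fraction of (count * scale + pseudo) contributed by the pseudo count."""
    return pseudo / (count * scale + pseudo)


# A bin with a raw value of 0.5, scaled by 2 and by 10:
print(pseudocount_share(0.5, 2))
print(pseudocount_share(0.5, 10))

# An empty bin is all pseudo count regardless of the scaling factor:
print(pseudocount_share(0, 10))
```

In this toy model the share for non-empty bins shrinks as the factor grows, while empty bins are unaffected by scaling, which is why the number of zero bins drives the size of the error.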
Okay, thank you. I will get back to you if I run into any other issues.
Sorry to bother you again. In the next step I got another error.
When I run
mpirun -np 4 hifive hic-project output.hic.data output.hic.project
I get
Traceback (most recent call last):
File "/cvmfs/soft.mugqic/CentOS6/software/python/Python-2.7.14/bin/hifive", line 849, in
Do you have any idea what the reason for this might be?
Thank you
This suggests that mpi4py is not installed.
mpi4py was already installed. I checked with the test code specified in the documentation
mpirun -np 5 python helloworld.py
and it works fine
I tried several things but still get the same error
I think I got some clues.
There is an issue with the statement from mpi4py import MPI
When I run it, I get
Traceback (most recent call last):
File "
I will try to resolve this and it will probably resolve the issue. Thank you
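For anyone hitting the same thing, a small check like the sketch below separates "the mpi4py package is missing" from "the package is present but its MPI extension fails to load". The helper name check_import is made up for this example, and it should be run with the same Python interpreter that hifive uses.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or a loader error from a broken build
        return False, "{}: {}".format(type(exc).__name__, exc)
    return True, "ok"


# Usage sketch: check the package first, then the compiled MPI submodule.
# for name in ("mpi4py", "mpi4py.MPI"):
#     print("{} -> {}".format(name, check_import(name)[1]))
```

If the first import succeeds and the second fails, the message from the second usually points at the underlying MPI library rather than at the Python package itself.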
I uninstalled mpi4py and now it's working fine with sequential processing
I tried to create a hic-data object using the following commands
hifive fends -B fend.bed --binned=50000 out_filename.txt
When I run hifive hic-data -X "test_*.MatA" out_filename.txt output.hic.data I get "Done 0 cis reads, 0 trans reads" in the command line output, and the resulting file (only 6.4 kB in size) cannot be used in the next steps.
I obtained interaction matrices from HOMER and modified them according to the explanation in the documentation. The structure of one test_*.MatA file is
chr1:0-50000 chr1:50000-100000 chr1:100000-150000 chr1:150000-200000
chr1:0-50000 7 0 0 0
chr1:50000-100000 0 11 0 0
chr1:100000-150000 0 0 28 0
chr1:150000-200000 0 0 0 0
The headers are the column and row names, and the file is a TSV file
Can you please guide me on how to resolve this issue, or tell me whether something is wrong with my matrix structure?