ListerLab / HOME

DMR Identification Tool

HOME-pairwise for different numbers of replicate #31

Open cell101 opened 4 years ago

cell101 commented 4 years ago

Hi, I'm trying to use HOME for DMR analysis. I have 6 replicates for WT and 4 replicates for the mutant. When I run HOME-pairwise, I get the following error message:

[lee@ko44 HOME]$ ${HOME_DMR}/HOME-pairwise -t CG -npp 16 -i ${data}/HOME_DMR_sample_file_CG.txt -o ${data}/HOME_DMR_gz_out --BSSeeker2 --delta 0.2 --minc 5
Traceback (most recent call last):
  File "/home/lee/NGS/sw/HOME_met/bin/HOME-pairwise", line 4, in <module>
    __import__('pkg_resources').run_script('HOME==1.0.0', 'HOME-pairwise')
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/HOME-1.0.0-py2.7.egg/EGG-INFO/scripts/HOME-pairwise", line 314, in <module>
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 498, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 285, in _read
    return parser.read()
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in read
    ret = self._engine.read(nrows)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 1197, in read
    data = self._reader.read(nrows)
  File "pandas/parser.pyx", line 766, in pandas.parser.TextReader.read (pandas/parser.c:7988)
  File "pandas/parser.pyx", line 788, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)
  File "pandas/parser.pyx", line 842, in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)
  File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)
  File "pandas/parser.pyx", line 1833, in pandas.parser.raise_parser_error (pandas/parser.c:22649)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 5 fields in line 2, saw 7

I set HOME_DMR_sample_file_CG.txt as:

mutant /data/path/mutant.r1.CGmap.gz data/path/mutant.r2.CGmap.gz data/path/mutant.r3.CGmap.gz data/path/mutant.r4.CGmap.gz
WT /data/path/WT.r1.CGmap.gz /data/path/WT.r2.CGmap.gz /data/path/WT.r3.CGmap.gz /data/path/WT.r4.CGmap.gz /data/path/WT.r5.CGmap.gz /data/path/WT.r6.CGmap.gz

I wonder whether HOME does not support samples with different numbers of replicates.

Do you have any suggestions for this case?

Akanksha2511 commented 4 years ago

Hi, HOME supports samples with different numbers of replicates, so that's not the issue. Is your HOME_DMR_sample_file_CG.txt tab-separated?

If not, please tab separate it.

Also, there seems to be a missing leading slash in the paths for mutant replicates 2, 3 and 4. Please check the paths for the replicates and provide full paths (HOME does not support relative paths at the moment).
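The two checks suggested here (tab separation and full paths) can be run before calling HOME-pairwise. The following is a minimal sketch and not part of HOME: `check_sample_file` is a hypothetical helper, and the assumed layout (one sample per line, sample name first, then replicate paths, all separated by tabs) is inferred from this thread.

```python
import posixpath


def check_sample_file(text):
    """Return a list of problems found in the text of a sample file.

    Assumed layout (inferred from this thread, not from HOME's docs):
    one sample per line, the sample name followed by replicate paths,
    with every field separated by a tab.
    """
    problems = []
    rows = [line for line in text.splitlines() if line.strip()]
    for number, line in enumerate(rows, start=1):
        fields = line.split("\t")
        if len(fields) < 2:
            # Without tabs the whole line is a single field.
            problems.append(
                "line %d: no tab-separated replicate paths found" % number
            )
            continue
        for path in fields[1:]:
            # Empty fields are padding, not paths, so they are skipped.
            if path and not posixpath.isabs(path):
                problems.append("line %d: relative path %r" % (number, path))
    return problems


if __name__ == "__main__":
    example = (
        "mutant\t/data/path/mutant.r1.CGmap.gz\tdata/path/mutant.r2.CGmap.gz\n"
        "WT\t/data/path/WT.r1.CGmap.gz\t/data/path/WT.r2.CGmap.gz\n"
    )
    for problem in check_sample_file(example):
        print(problem)
```

An empty result only means the file passed these two checks; it does not guarantee that HOME-pairwise will accept it.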

Thanks, Akanksha

cell101 commented 4 years ago

Hi Akanksha

Following your comments, I used full paths and tab separation, but it still did not work and showed the same error message:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 5 fields in line 2, saw 7

I then tested HOME_DMR_sample_file_CG.txt with blank tabs added so that both lines have the same number of fields, and it works:

mutant /data/path/mutant.r1.CGmap.gz data/path/mutant.r2.CGmap.gz data/path/mutant.r3.CGmap.gz data/path/mutant.r4.CGmap.gz {tab} {tab}
WT /data/path/WT.r1.CGmap.gz /data/path/WT.r2.CGmap.gz /data/path/WT.r3.CGmap.gz /data/path/WT.r4.CGmap.gz /data/path/WT.r5.CGmap.gz /data/path/WT.r6.CGmap.gz

I don't know why, but it works. I think that when HOME-pairwise reads the input, pandas.parser needs the mutant and WT lines to have the same number of fields.
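The padding workaround described above can be scripted instead of edited by hand. This is a minimal sketch under stated assumptions: `pad_sample_rows` is a hypothetical helper, not part of HOME, and it assumes the sample file is tab-separated with one sample per line. Whether HOME handles the empty trailing fields correctly in every case is not confirmed here.

```python
def pad_sample_rows(text):
    """Pad every row with empty tab-separated fields up to the widest row.

    Assumes a tab-separated sample file with one sample per line
    (layout inferred from this thread).
    """
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Append empty strings so that every row has exactly `width` fields.
    padded = [row + [""] * (width - len(row)) for row in rows]
    return "\n".join("\t".join(row) for row in padded) + "\n"


if __name__ == "__main__":
    example = "mutant\t/m1\t/m2\nWT\t/w1\t/w2\t/w3\t/w4\n"
    # repr() makes the added trailing tabs visible.
    print(repr(pad_sample_rows(example)))
```

The padded text can then be written to a new file and passed to HOME-pairwise with `-i`, leaving the original sample file untouched.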

I also tried the testcase files with the following sample file:

sample1 /home/lee/NGS/sw/HOME/testcase/CG/sample1_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample1_r2.txt
sample2 /home/lee/NGS/sw/HOME/testcase/CG/sample2_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample2_r2.txt /home/lee/NGS/sw/HOME/testcase/CG/sample3_r1.txt

(I added sample3_r1.txt because with 1 replicate vs 2 replicates I got the following message: "error: cannot handle 1 replicate in 1 group and more than 1 in other".)

Running HOME-pairwise on this file gave:

Traceback (most recent call last):
  File "/home/lee/NGS/sw/HOME_met/bin/HOME-pairwise", line 4, in <module>
    __import__('pkg_resources').run_script('HOME==1.0.0', 'HOME-pairwise')
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/HOME-1.0.0-py2.7.egg/EGG-INFO/scripts/HOME-pairwise", line 314, in <module>
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 498, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 285, in _read
    return parser.read()
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in read
    ret = self._engine.read(nrows)
  File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 1197, in read
    data = self._reader.read(nrows)
  File "pandas/parser.pyx", line 766, in pandas.parser.TextReader.read (pandas/parser.c:7988)
  File "pandas/parser.pyx", line 788, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)
  File "pandas/parser.pyx", line 842, in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)
  File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)
  File "pandas/parser.pyx", line 1833, in pandas.parser.raise_parser_error (pandas/parser.c:22649)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 3 fields in line 2, saw 4

If I add a blank tab so that both lines have the same number of fields:

sample1 /home/lee/NGS/sw/HOME/testcase/CG/sample1_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample1_r2.txt {tab}
sample2 /home/lee/NGS/sw/HOME/testcase/CG/sample2_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample2_r2.txt /home/lee/NGS/sw/HOME/testcase/CG/sample3_r1.txt

it works:

Preparing the DMRs from HOME..... GOOD LUCK !
DMRs for sample1_VS_sample2_13 done
DMRs for sample1_VS_sample2_10 done
DMRs for sample1_VS_sample2_12 done
Congratulations the DMRs are ready

I hope this report helps improve HOME.

Akanksha2511 commented 4 years ago

OK, glad it worked. We tested with equal numbers of replicates and it works perfectly fine, but we will test it again. Thanks.

Fred6887 commented 3 years ago

I get the exact same error, but for me the blank tab doesn't work. There is an issue when replicate numbers are not the same. I have 2 replicates for the WT and 4 for the mutant.