ijuric / MAPS

String index out of range error #2

Open mapsuser1 opened 5 years ago

mapsuser1 commented 5 years ago

Hello,

Below is an error log that I keep getting (some paths redacted). It seems to refer to an index out of range. Do you have any advice on what could be causing this issue?

Traceback (most recent call last):
  File "/X/MAPS/bin/feather/feather_pipe", line 116, in <module>
    main()
  File "/X/MAPS/bin/feather/feather_pipe", line 50, in main
    split_main(filter_output_filename, outdir, prefix, length_threshold, per_chr_bedpe, generate_hic)
  File "/X/MAPS/bin/feather/feather_split_rongxin.py", line 25, in split_main
    autosomal_chrs = [chr_name for chr_name in chr_list if (chr_name.find('_') == -1 and chr_name[3].isdigit())]
  File "/X/MAPS/bin/feather/feather_split_rongxin.py", line 25, in <listcomp>
    autosomal_chrs = [chr_name for chr_name in chr_list if (chr_name.find('_') == -1 and chr_name[3].isdigit())]
IndexError: string index out of range
usage: PROG [-h] [--BINNING_RANGE BINNING_RANGE] DATASET_NAME OUT_DIR MACS2_PATH GF_PATH LONG_PATH SHORT_PATH BIN_SIZE N_CHROMS OUT_FILE_PATH
PROG: error: the following arguments are required: OUT_FILE_PATH
Traceback (most recent call last):
  File "/X/MAPS/bin/MAPS/MAPS.py", line 225, in <module>
    main()
  File "/X/MAPS/bin/MAPS/MAPS.py", line 222, in main
    init(p)
  File "/X/MAPS/bin/MAPS/MAPS.py", line 78, in init
    input_data = pd.read_csv(p.run_file,sep='=',skip_blank_lines=True, comment='#',index_col=0,header=None)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 402, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 718, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: File b'/X/MAPS_output/sra_data_20181128_193810/maps_sra_data.maps' does not exist
Loading required package: methods
Loading required package: stats4
Loading required package: splines
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
Execution halted
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file '/X/MAPS_output/sra_data_20181128_193810/sra_data.5k.2.peaks': No such file or directory
Execution halted

mapsuser1 commented 5 years ago

Some additional output that I get from MAPS before the error (again, some paths redacted):

Wed Nov 28 19:38:12 2018 starting mapping and filtering operation
Wed Nov 28 19:38:12 2018 calling bwa for /X/sra_data_R1.fastq
Wed Nov 28 20:20:22 2018 calling bwa for /X/sra_data_R2.fastq
Wed Nov 28 21:05:18 2018 calling samtools sort for /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R1.fastq.bwa.sam storing in /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R1.fastq.bwa.sam.srtn
Wed Nov 28 21:19:10 2018 calling samtools sort for /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R2.fastq.bwa.sam storing in /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R2.fastq.bwa.sam.srtn
Wed Nov 28 21:32:50 2018 merging /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R1.fastq.bwa.sam.srtn and /X/feather_output/sra_data_20181128_193810/tempfiles/sra_data_R2.fastq.bwa.sam.srtn
Wed Nov 28 22:38:55 2018 filtering and pairing reads
Thu Nov 29 00:19:44 2018 paired bam file generated. Sorting by coordinates.
Thu Nov 29 00:33:48 2018 calling samtools rmdup
Thu Nov 29 01:10:30 2018 calling samtools flagstat on mapped file
Thu Nov 29 01:14:50 2018 calling samtools flagstat on mapped and duplicate-removed file
Thu Nov 29 01:19:03 2018 calling samtools sort for sorting by query names
Thu Nov 29 01:36:40 2018 finishing filtering
Thu Nov 29 01:36:40 2018 starting the splitting operation
sra_data /X/MAPS_output/sra_data_20181128_193810/ /X/MAPS/bin/../MAPS_data_files/hg19/genomic_features/F_GC_M_MboI_5Kb_el.hg19.txt /X/feather_output/sra_data_current/ /X/feather_output/sra_data_current/ 5000 22 /X/MAPS_output/sra_data_20181128_193810/
first
loading parameters file
second
[1] "/X/MAPS_output/sra_data_20181128_193810/"
[2] "sra_data.5k"
[3] "5000"
[4] "22"
[5] "None"
chr bin
1 chrNONE -1
[1] "loading chromosome chr1 .and"
third

ijuric commented 5 years ago

Hi,

So, the issue might be that we require chromosome names of the form "chrXXX": "chr1", "chr2", and so on. Are you perhaps using "1", "2", "3", and so on for your chromosome labels?
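To illustrate why bare labels would trigger this error, here is a minimal sketch. `is_autosomal_original` copies the condition from line 25 of feather_split_rongxin.py as it appears in the traceback; `is_autosomal_tolerant` is a hypothetical alternative written for this example and is not the code in the MAPS repository.

```python
def is_autosomal_original(chr_name):
    # Same test as in the traceback: no underscore in the name,
    # and the 4th character (index 3) is a digit.
    # Names shorter than 4 characters raise IndexError here.
    return chr_name.find('_') == -1 and chr_name[3].isdigit()


def is_autosomal_tolerant(chr_name):
    # Hypothetical variant: drop an optional "chr" prefix, then require
    # the remainder to consist only of digits.
    core = chr_name[3:] if chr_name.startswith("chr") else chr_name
    return core.isdigit()


def raises_index_error(chr_name):
    # Helper for the demonstration: report whether the original test fails.
    try:
        is_autosomal_original(chr_name)
    except IndexError:
        return True
    return False


if __name__ == "__main__":
    for name in ["chr1", "chrX", "chr1_gl000191_random", "1", "22", "MT"]:
        print(name, raises_index_error(name), is_autosomal_tolerant(name))
```

With "chr"-prefixed names the original test always finds a character at index 3, whereas labels such as "1" or "22" are too short to have one.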

mapsuser1 commented 5 years ago

I used the recommended workflow included in the MAPS documentation:

Could this be causing the issue? If yes, would any bwa setting be able to include "chr" in the chromosome labels?

mapsuser1 commented 5 years ago

I switched to the GRCh38 reference recommended in the MAPS documentation, and the index out of range issue seems to be avoided. However, MAPS fails afterwards with the following error log:

usage: PROG [-h] [--BINNING_RANGE BINNING_RANGE] DATASET_NAME OUT_DIR MACS2_PATH GF_PATH LONG_PATH SHORT_PATH BIN_SIZE N_CHROMS OUT_FILE_PATH
PROG: error: the following arguments are required: OUT_FILE_PATH
Traceback (most recent call last):
  File "/X/MAPS/bin/MAPS/MAPS.py", line 225, in <module>
    main()
  File "/X/MAPS/bin/MAPS/MAPS.py", line 222, in main
    init(p)
  File "/X/MAPS/bin/MAPS/MAPS.py", line 78, in init
    input_data = pd.read_csv(p.run_file,sep='=',skip_blank_lines=True, comment='#',index_col=0,header=None)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/X/.local/lib/python3.4/site-packages/pandas/io/parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 402, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 718, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: File b'/X/MAPS_output/sra_data_20181130_092609/maps_sra_data.maps' does not exist
Loading required package: methods
Loading required package: stats4
Loading required package: splines
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
Execution halted
Error in file(file, "rt") : cannot open the connection
Calls: read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'X/MAPS_output/sra_data_20181130_092609/sra_data.5k.2.peaks': No such file or directory
Execution halted

Let me know if you have any advice on this error.

ijuric commented 5 years ago

Hi,

Hard to say just from that. Maybe some directories are misspecified. Can you send me your run_pipeline.sh script (send it to ivan.juric.gen@gmail.com)? Also, I've pushed a version of the feather_split_rongxin.py script that deals with chromosome names without 'chr' (thanks, Armen, for the fix!). It should now work with the hg19 version too. Can you try it and let me know how it goes?

Also, how many reads do you have in your fastq files?
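As a possible illustration of how a misspecified setting could produce the "the following arguments are required: OUT_FILE_PATH" message, the sketch below rebuilds a parser from the usage line in the log. The positional names come from that usage line; the parser construction, the `RaisingParser` class, and the example values are assumptions for demonstration, not the actual code in MAPS.py. The idea is that if one positional value is empty or dropped before the program is called (for example, an unset path variable in the run script), the remaining values shift left and the last positional is reported as missing.

```python
import argparse

POSITIONALS = [
    "DATASET_NAME", "OUT_DIR", "MACS2_PATH", "GF_PATH", "LONG_PATH",
    "SHORT_PATH", "BIN_SIZE", "N_CHROMS", "OUT_FILE_PATH",
]


class RaisingParser(argparse.ArgumentParser):
    # Hypothetical helper: raise instead of exiting so the message can be inspected.
    def error(self, message):
        raise ValueError(message)


def build_parser():
    # Assumed reconstruction of the interface shown in the usage line.
    parser = RaisingParser(prog="PROG")
    parser.add_argument("--BINNING_RANGE")
    for name in POSITIONALS:
        parser.add_argument(name)
    return parser


def parse_error(argv):
    # Return the argparse error message, or None if parsing succeeds.
    try:
        build_parser().parse_args(argv)
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Nine illustrative values, one per positional.
    nine = ["sra_data", "/X/out/", "/X/macs2/", "/X/gf.txt", "/X/long/",
            "/X/short/", "5000", "22", "/X/out/"]
    # Same call with the third value lost: only eight values reach the parser.
    eight = nine[:2] + nine[3:]
    print(parse_error(nine))
    print(parse_error(eight))
```

Checking that every variable passed on that command line in run_pipeline.sh is set and non-empty would be one way to rule this cause in or out.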

mapsuser1 commented 5 years ago

Hi Ivan,

Thank you for your reply: I have emailed you the information.