Building Index WGBS Error

bacantre commented 7 years ago

Hello,

I am trying to use BS Seeker 2 to make an index of the Bovine Reference Genome for later alignment with WGBS data.

I ran the code: python bs_seeker2-build.py -f ~/reference/file/location --aligner=bowtie2 -d ~/output/file/location

I got the error: Traceback (most recent call last): File "bs_seeker2-build.py", line 5, in <module> from bs_index.wg_build import * ImportError: No module named bs_index.wg_build

I had my university core download BS Seeker2 and Bowtie2 to the server I am using. I have also downloaded the bs_seeker2-build.py, bs_seeker2-align.py, and bs_seeker2-call_methylation.py files from here. Are there other files that I or the university core need to download?

I also assumed the .py files were meant to run as is, so I did not modify them; I only saved them as .txt and then changed the file names to .py before transferring them to the server.

Also do I need to unzip the reference genome from fa.gz to make it fa or will it run as a fa.gz?

Thank you, Bonnie

guoweilong commented 7 years ago

Hi Bonnie,

Actually, you need to download the complete BS-Seeker2 package from the homepage: https://github.com/BSSeeker/BSseeker2 . The package contains other scripts that are needed.
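A quick way to confirm that the full package was copied, not just the three top-level scripts, is to look for the package directories next to them. This is a minimal sketch: the directory names are taken from the tracebacks and paths quoted in this thread, the real distribution may contain more, and `missing_package_dirs` is a hypothetical helper name.

```python
import os

# Package directories named in tracebacks and paths in this thread;
# the actual distribution may include more than these.
REQUIRED_DIRS = ("bs_index", "bs_align", "bs_utils")


def missing_package_dirs(install_dir):
    """Return the expected sub-directories that are absent from install_dir."""
    return [name for name in REQUIRED_DIRS
            if not os.path.isdir(os.path.join(install_dir, name))]
```

An empty list from `missing_package_dirs("BSseeker2-master")` suggests the imports in bs_seeker2-build.py should at least be able to find their packages.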

For your second question, I suggest you unzip the .fa.gz file to a .fa file, and then build the genome index.
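Renaming a .fa.gz file does not decompress it; the contents have to be written out again as plain text. As a minimal sketch using only the Python standard library (`gunzip_fasta` is a hypothetical helper name, not part of BS-Seeker2):

```python
import gzip
import shutil


def gunzip_fasta(gz_path, fasta_path):
    """Decompress gz_path into fasta_path, streaming so large genomes are not loaded into memory."""
    with gzip.open(gz_path, "rb") as src, open(fasta_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return fasta_path
```

On the command line, `gunzip genome.fa.gz` or `zcat genome.fa.gz > genome.fa` achieves the same result.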

Best, Weilong

bacantre commented 6 years ago

Hello Weilong,

I have been in contact with my university and am trying to figure out what went wrong with the download of this. I ended up copying over this entire download and unzipped it. I then ran this code and got back this error:

python bs_seeker2-build.py -f UMD3.1_chromosomes.fa --aligner=bowtie2 -d ~/WGBS_Bovine_Brain/reference/

 BS-Seeker2 v2.1.3 - Oct. 25, 2017

Traceback (most recent call last): File "bs_seeker2-build.py", line 56, in <module> if os.path.isfile(os.path.join(os.dbpath, fasta_file)): AttributeError: 'module' object has no attribute 'dbpath'

I have the unzipped file located in ~/WGBS_Bovine_Brain/reference/ and am running this code within the BSseeker2-master folder so that I am running the code in the same location as the bs_seeker2-build.py. I unzipped the reference genome (UMD3.1_chromosomes.fa from UMD3.1_chromosomes.fa.gz, but zcat left the .gz so I just removed it in WinSCP, I am not actually sure if this is a proper way to unzip it and if that is my problem).

Do you know what I am doing wrong? Are there any recommendations for what I need to tell my University to make sure this is installed correctly, if it is something on their end? They have me load BSseeker2 and Bowtie2 with the commands “spack load BSseeker2” and “spack load Bowtie2”, if this helps you.

Thank you, Bonnie

guoweilong commented 6 years ago

Thanks for reporting this error message. It was a bug, though rarely reported, and it has now been fixed in v2.1.5. I also suspect you did not give the right path for genome.fa: -d specifies where to store the index directory. If you want to specify the path for genome.fa, you can use "-f /genome.fa".
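To make the division of roles between -f and -d concrete, here is a minimal sketch that checks both paths before assembling the build command. Only the flags already shown in this thread are used; `build_argv` is a hypothetical helper and is not part of BS-Seeker2.

```python
import os


def build_argv(fasta_path, index_parent_dir, aligner="bowtie2"):
    """Return the argument list for bs_seeker2-build.py after checking both paths."""
    fasta_path = os.path.expanduser(fasta_path)
    index_parent_dir = os.path.expanduser(index_parent_dir)
    # -f must name the genome file itself, including its directory.
    if not os.path.isfile(fasta_path):
        raise ValueError("-f must point to the genome FASTA file: %s" % fasta_path)
    # -d is only the place where the index directory will be written.
    if os.path.exists(index_parent_dir) and not os.path.isdir(index_parent_dir):
        raise ValueError("-d must be a directory: %s" % index_parent_dir)
    return ["python", "bs_seeker2-build.py",
            "-f", fasta_path,
            "--aligner=" + aligner,
            "-d", index_parent_dir]
```

The returned list could be passed to `subprocess.call` from inside the BSseeker2-master folder; a bare file name for -f only works if the genome sits in the current working directory.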

Best, Weilong

bacantre commented 6 years ago

Hello Weilong,

Thank you for all of your help and for fixing this.

I tried to rerun the code again with the new version and got a little farther in the code, but am still receiving errors. This is what I got. It seems like it might be a problem with my reference genome file. Do you know from this error what I am doing wrong? Code below.

python bs_seeker2-build.py -f ~/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa --aligner=bowtie2 -d ~/WGBS_Bovine_Brain/reference/

 BS-Seeker2 v2.1.5 - Dec. 21, 2017

Reference genome file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa
Reduced Representation Bisulfite Sequencing: False
Short reads aligner you are using: bowtie2
Builder path: /gpfs1/arch/spack-20170803/opt/spack/linux-rhel7-x86_64/gcc-5.4.0/bowtie2-2.2.5-ojakqj7byt2zuj6hn2d5ezuopyhudm5d/bin/bowtie2-build
[Preprocessing 8fO] Last: 0:00:00.023925 Total: 0:00:00.023955
[Preprocessing ____y_k__0XND____a9vtS__K____Q0_X__] Last: 0:00:00.003813 Total: 0:00:00.027802
[Preprocessing Y2_] Last: 0:00:00.056983 Total: 0:00:00.084818
[Preprocessing __W_iW_47po__] Last: 0:00:00.002809 Total: 0:00:00.087659
[Preprocessing _7c9____M_a_Yb_c1] Last: 0:00:00.000844 Total: 0:00:00.088533
[Preprocessing ____Ff] Last: 0:00:00.008730 Total: 0:00:00.097293
[Preprocessing 649_____1__9_] Last: 0:00:00.008953 Total: 0:00:00.106276
[Preprocessing h] Last: 0:00:00.006022 Total: 0:00:00.112328
[Preprocessing focp_E__2TU_k____3_LZ____bXl_Yl 1LN_8m] Last: 0:00:00.004124 Total: 0:00:00.116483
[Preprocessing _SyMp__L7__E] Last: 0:00:00.003323 Total: 0:00:00.119865
[Preprocessing K8_GRT0_k__p_UO_Xy] Last: 0:00:00.004664 Total: 0:00:00.124559
[Preprocessing _l____] Last: 0:00:00.018344 Total: 0:00:00.142935
[Preprocessing __I____H_WDQi_9___CRqI8kBr_g_R__70_ZU __Y____i_cdJNtp5_Eq__T_FaA] Last: 0:00:00.002081 Total: 0:00:00.145049
[Preprocessing e____CH1__mxX__v__t1__Dm__i 1____iUzcjRL] Last: 0:00:00.015408 Total: 0:00:00.160491
[Preprocessing ] Last: 0:00:00.003249 Total: 0:00:00.163771
[Preprocessing W__1__m__ch9____gV_3_WwIhx__BU_bW_ ___K_n____QSaP_O_Sf_Y_i____4x_J__] Last: 0:00:00.001747 Total: 0:00:00.165547
[Preprocessing 5R7Y__TEMB_D_nkp____DO_S__Sw_0kO__ _t_] Last: 0:00:00.007278 Total: 0:00:00.172856
[Preprocessing UqA0____2A_Qh_D__C] Last: 0:00:00.030251 Total: 0:00:00.203164
[Preprocessing H_MJ4T_MkL] Last: 0:00:00.015125 Total: 0:00:00.218320
[Preprocessing CkkM____eD__Dj___5___d_pT_i__H_P] Last: 0:00:00.007477 Total: 0:00:00.225829
[Preprocessing __B_F_0__pT_l_z_F_qit__F_l5B6Kl_EO____S OM_tL_b___B_B_xaKO0sz] Last: 0:00:00.002442 Total: 0:00:00.228302
[Preprocessing R46va__vD____tX] Last: 0:00:00.016771 Total: 0:00:00.245107
[Preprocessing __yy____t_x_z_3_____4L__a_LGWf_ 3_gef___W_NUf3_c___3UB__3F___A___r2_gv_ge_9j __W_] Last: 0:00:00.007155 Total: 0:00:00.252294
[Preprocessing ] Last: 0:00:00.003678 Total: 0:00:00.256004
[Preprocessing 1TLI_1] Last: 0:00:00.006726 Total: 0:00:00.262761
ERROR: BS Seeker found identical sequence ids (id: ) in the fasta file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa. Please, make sure that all sequence ids are unique and contain only alphanumeric characters: A-Za-z0-9

Thank you, Bonnie

guoweilong commented 6 years ago

@bacantre Judging from the error message: is your input genome file /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa a text file or a binary file?

It should be a plain-text file.

Also, do you really have a chromosome or contig named "8_fO"? Please double-check that you specified the right genome file.
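One way to answer the text-or-binary question before running the build is to inspect the file directly. The sketch below (Python 3) looks for the gzip magic bytes and then checks the sequence ids against the wording of the error message above. `check_fasta` is a hypothetical helper, and the id check follows the error text, so the tool itself may be more lenient about id characters.

```python
import re

# Character set quoted in the BS-Seeker2 error message: A-Za-z0-9
ID_PATTERN = re.compile(r"^[A-Za-z0-9]+$")


def check_fasta(path):
    """Return a list of problems found in the FASTA file at path (empty if none)."""
    with open(path, "rb") as handle:
        if handle.read(2) == b"\x1f\x8b":  # gzip magic number
            return ["file is gzip-compressed, not plain text"]
    problems, seen = [], set()
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # only header lines carry sequence ids
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if seq_id in seen:
                problems.append("duplicate id: %r" % seq_id)
            seen.add(seq_id)
            if not ID_PATTERN.match(seq_id):
                problems.append("non-alphanumeric id: %r" % seq_id)
    return problems
```

A file that still reports as gzip-compressed here needs to be decompressed, not renamed, before it is passed to -f.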

Best, Weilong

bacantre commented 6 years ago

Hello Weilong,

Happy holidays. I decided to download a different reference genome format and then re-installed it appropriately. This allowed me to run the code: python bs_seeker2-build.py -f ~/WGBS/reference/referenceUMD3.1.1.fa --aligner=bowtie2 -d ~/WGBS/reference/

I then got a folder named referenceUMD3.1.1.fa_bowtie2 that contains .data files, .bt2 files, and .log files. This all seemed to work correctly.

When I went to do the alignment, it all went wrong again. I ran the code and got the error below:

python bs_seeker2-align.py -1 ~/WGBS/fastQfiles/D2239Amy_1.fq -2 ~/WGBS/fastQfiles/D2239Amy_2.fq --aligner=bowtie2 -o ~/WGBS/alignment/D2239Amy.bam -f bam -g ~/WGBS/reference/referenceUMD3.1.1.fa

 BS-Seeker2 v2.1.5 - Dec. 21, 2017

ERROR: Index DIR "referenceUMD3.1.1.fa.." cannot be found in /gpfs1/home/b/a/bacantre/WGBS/reference/BSseeker2-master/bs_utils/reference_genomes. Please run the bs_seeker2-build.py to create it with the correct parameters for -g, -r, --low, --up and --aligner.

It seems like it wants reference_genomes to be a directory, but it creates this as a file in bs_utils. I have tried to make a directory called reference_genomes, but I get an error because a file already has that name. I have also tried moving the index directory (referenceUMD3.1.1.fa_bowtie2) and the reference genome to the bs_utils directory that I put in WGBS/reference.

I also tried redoing the build below without specifying a -d, but got this error:

python bs_seeker2-build.py -f ~/WGBS/reference/referenceUMD3.1.1.fa --aligner=bowtie2

 BS-Seeker2 v2.1.5 - Dec. 21, 2017

Reference genome file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/referenceUMD3.1.1.fa
Reduced Representation Bisulfite Sequencing: False
Short reads aligner you are using: bowtie2
Builder path: /gpfs1/arch/spack-20170803/opt/spack/linux-rhel7-x86_64/gcc-5.4.0/bowtie2-2.2.5-ojakqj7byt2zuj6hn2d5ezuopyhudm5d/bin/bowtie2-build
ERROR: /gpfs1/home/b/a/bacantre/WGBS/reference/BSseeker2-master/bs_utils/reference_genomes must be a directory. Please, delete it or change the -d option.

What am I doing wrong? Is there a specific location the builds are supposed to be sent to? Was getting reference_genomes as a text file in bs_utils correct for running the build code?

Additionally: I originally tried this with also adding a -d ~/WGBS/reference/referenceUMD3.1.1.fa_bowtie2 to indicate the index, but got the same error for reference location either way. I originally thought it was the index file, so I took out using -d

Do I need to reference the index folder? Your examples just use the -g, so I stuck with that to keep it simple.

Thank you, Bonnie

guoweilong commented 6 years ago

Hi @bacantre ,

Since you built the index using the following command:

python bs_seeker2-build.py -f ~/WGBS/reference/referenceUMD3.1.1.fa --aligner=bowtie2 -d ~/WGBS/reference/

you need to specify the folder in the following way:

python bs_seeker2-align.py -1 ~/WGBS/fastQfiles/D2239Amy_1.fq -2 ~/WGBS/fastQfiles/D2239Amy_2.fq --aligner=bowtie2 -o ~/WGBS/alignment/D2239Amy.bam -f bam -g referenceUMD3.1.1.fa -d ~/WGBS/reference/

Please note that, for bs_seeker2-align.py: 1) parameter "-g" should specify the genome file name, without the path; 2) parameter "-d" should specify the parent directory in which you created the index folder, without the index folder name.
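The two rules above amount to a small mapping from the build arguments to the align arguments. A minimal sketch, assuming the behavior is exactly as described in this reply (`align_index_args` is a hypothetical helper name):

```python
import os


def align_index_args(build_fasta, build_db_dir):
    """Map the build step's -f and -d values to the align step's -g and -d values."""
    genome_name = os.path.basename(build_fasta)    # file name only, no path
    parent_dir = os.path.expanduser(build_db_dir)  # parent of the index folder
    return ["-g", genome_name, "-d", parent_dir]
```

For the build command quoted above, this yields `-g referenceUMD3.1.1.fa` together with the same -d directory that was given to the build step.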

Let me know if it still does not work.

Best, Weilong

justinjohns commented 6 years ago

Related issue -- I've tried 2 genome indexes, once in default directory, and once here:

python bs_seeker2-build.py -f /shafer3/lynx_meth/genome/lynx.fa --aligner=bowtie2 -r -c AT-TAAT,ATGCA-T -d /shafer3/lynx_meth/genome/bs2/

Failed alignment, cannot find directory:

python bs_seeker2-align.py -1 /shafer3/lynx_meth/data/raw_fastq/1_R1.fastq -2 /shafer3/lynx_meth/data/raw_fastq/1_R2.fastq --aligner=bowtie2 -o /shafer3/lynx_meth/data/bs_bam/0001.bam -f bam -g lynx.fa -d /shafer3/lynx_meth/genome/bs2/

 BS-Seeker2 v2.1.3 - Oct. 25, 2017

ERROR: Index DIR "lynx.fa.." cannot be found in /shafer3/lynx_meth/genome/bs2/. Please run the bs_seeker2-build.py to create it with the correct parameters for -g, -r, --low, --up and --aligner.

Contents of the indexed directory: lynx.fa_rrbs_ATTAAT-ATGCAT_20_500_bowtie2 index_directory.txt Head of genome: head_lynx.txt

Bowtie2 warns that "Warning: Encountered reference sequence with only gaps", but I have indexed and aligned successfully with Bismark (using Bowtie2), so I don't see why this isn't working. Originally the genome was named ena.fa, but I renamed it to lynx.fa; both indexes came up with the same results.

My dd enzymes were AseI / NsiI.

Thanks for any tips! Justin

guoweilong commented 6 years ago

Hi Justin,

Since you ran this command to build the genome index:

python bs_seeker2-build.py -f /shafer3/lynx_meth/genome/lynx.fa --aligner=bowtie2 -r -c AT-TAAT,ATGCA-T -d /shafer3/lynx_meth/genome/bs2/

Then you need to also specify parameter "-r" (for RRBS) and "-c" (for your enzymes) for alignment:

python bs_seeker2-align.py -1 /shafer3/lynx_meth/data/raw_fastq/1_R1.fastq -2 /shafer3/lynx_meth/data/raw_fastq/1_R2.fastq --aligner=bowtie2 -o /shafer3/lynx_meth/data/bs_bam/0001.bam -f bam -g lynx.fa -d /shafer3/lynx_meth/genome/bs2/ -r -c AT-TAAT,ATGCA-T
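To avoid forgetting one of these options, the ones that define the index can be copied from the build arguments into the align arguments. This sketch takes its option list from the error message quoted in this thread (-r, --low, --up, --aligner) plus -c from the reply above; `shared_index_args` is a hypothetical helper, not a BS-Seeker2 function.

```python
# Options that, according to this thread, must match between build and align.
INDEX_OPTIONS = ("-r", "-c", "--low", "--up", "--aligner")


def shared_index_args(build_args):
    """Pick out of build_args the options that must be repeated for alignment."""
    shared, i = [], 0
    while i < len(build_args):
        arg = build_args[i]
        name = arg.split("=", 1)[0]
        if name in INDEX_OPTIONS:
            shared.append(arg)
            # -r is a plain flag; the others take a value unless written as --opt=value
            if name != "-r" and "=" not in arg and i + 1 < len(build_args):
                i += 1
                shared.append(build_args[i])
        i += 1
    return shared
```

Appending the result to the bs_seeker2-align.py arguments, along with -g and -d, keeps the alignment step pointed at the index that was actually built.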

Let me know if it still does not work.

Best,

Weilong

christinalrichards commented 5 years ago

I am having a related issue trying to use BSseeker2 to map PE reads of WGBS to the reference tomato genome (AEKE03.fasta). I did not indicate a specific path with the -d option but used the default for the index with this command:

[clr@rra-login1 BSseeker2-master]$ python bs_seeker2-build.py -f AEKE03.fasta --aligner bowtie2

Which resulted in a directory full of .data files:

C_C2T.1.bt2
ENA_AEKE03000623_AEKE03000623.1.data
ENA_AEKE03001259_AEKE03001259.1.data
ENA_AEKE03001895_AEKE03001895.1.data
ENA_AEKE03002531_AEKE03002531.1.data
C_C2T.2.bt2
ENA_AEKE03000624_AEKE03000624.1.data
ENA_AEKE03001260_AEKE03001260.1.data
ENA_AEKE03001896_AEKE03001896.1.data
ENA_AEKE03002532_AEKE03002532.1.data
C_C2T.3.bt2
ENA_AEKE03000625_AEKE03000625.1.data
etc

and then ran for PE conversion to single end mode: [clr@rra-login0 BSseeker2-master]$ python bs_seeker2-align.py -1 10_P_1.fq -2 10_P_2.fq -g AEKE03.fasta -o 10_P.bam -u unmapped

got the error for pysam: [Error] It seems that you haven't install "pysam" package.. Please do it before you run this script.

We installed it with: [clr@rra-login0 BSseeker2-master]$ module load apps/python/2.7.15-el7 [clr@rra-login0 BSseeker2-master]$ pip freeze | grep pysam
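Note that `pip freeze | grep pysam` only lists a package that is already installed; it does not install one. A minimal sketch for confirming that the interpreter which runs bs_seeker2-align.py can actually import pysam (`module_available` is a hypothetical helper; run it with the same `python` that launches the script):

```python
import importlib


def module_available(name):
    """Return True if this interpreter can import the named module."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True
```

If `module_available("pysam")` is False after loading the Python module, the package is probably installed for a different interpreter or not at all.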

then ran: [clr@rra-login0 BSseeker2-master]$ python bs_seeker2-align.py -1 10_P_1.fq -2 10_P_2.fq -g AEKE03.fasta -o 10_P.bam -u unmapped

and got this error:

 BS-Seeker2 v2.1.8 - Oct. 30, 2018

ERROR: Index DIR "AEKE03.fasta.." cannot be found in /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes. Please run the bs_seeker2-build.py to create it with the correct parameters for -g, -r, --low, --up and --aligner.

Does it need -r, --low or --up for WGBS? Or do I need to modify the way that the index is built for PE?

guoweilong commented 5 years ago

Since you used "--aligner bowtie2" for building the index, you also need to use "--aligner=bowtie2" for the alignment step. By default, bs_seeker2-align.py will use "bowtie" rather than "bowtie2" for alignment.
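To see which aligner an existing index was built for, the index folder names can be inspected. The sketch below assumes the naming seen in this thread, where the folder name starts with the genome file name and ends with the aligner (for example AEKE03.fasta_bowtie2); that pattern is inferred from the examples here, not from the BS-Seeker2 documentation, and `built_indexes` is a hypothetical helper.

```python
import os


def built_indexes(db_dir, genome_name):
    """Return {index_dir_name: aligner} for indexes of genome_name found in db_dir."""
    found = {}
    for entry in sorted(os.listdir(db_dir)):
        full = os.path.join(db_dir, entry)
        if os.path.isdir(full) and entry.startswith(genome_name + "_"):
            found[entry] = entry.rsplit("_", 1)[1]  # suffix after the last "_"
    return found
```

If the only entry reported ends in bowtie2, the alignment step needs --aligner=bowtie2 to locate it.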

Best, Weilong

christinalrichards commented 5 years ago

Thanks so much for your quick response!! It looks like it started with this command:

[clr@rra-login1 BSseeker2-master]$ python bs_seeker2-align.py -1 10_P_1.fq -2 10_P_2.fq -g AEKE03.fasta --aligner=bowtie2 -o 10_P.bam -u unmapped

And ran this far to a new error: OSError: [Errno 2] No such file or directory BS-Seeker2 v2.1.8 - Oct. 30, 2018

[2019-08-23 09:50:15] Mode: Bowtie2, local alignment
[2019-08-23 09:50:15] Filter for tag XS: #(mCH)/#(all CH)>50.00% and #(mCH)>5
[2019-08-23 09:50:15] Temporary directory: /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui
[2019-08-23 09:50:15] Reduced Representation Bisulfite Sequencing: False
[2019-08-23 09:50:15] Pair end
[2019-08-23 09:50:15] Aligner command: None/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x %(reference_genome)s -f -1 %(input_file_1)s -2 %(input_file_2)s -S %(output_file)s
[2019-08-23 09:50:15] ----------------------------------------------
[2019-08-23 09:50:15] Filename for 1st mate: 10_P_1.fq
[2019-08-23 09:50:15] Filename for 2nd mate: 10_P_2.fq
[2019-08-23 09:50:15] The first base (for mapping): 1
[2019-08-23 09:50:15] The last base (for mapping): 200
[2019-08-23 09:50:15] Path for short reads aligner: None/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x %(reference_genome)s -f -1 %(input_file_1)s -2 %(input_file_2)s -S %(output_file)s
[2019-08-23 09:50:15] Reference genome library path: /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2
[2019-08-23 09:50:15] Directional library
[2019-08-23 09:50:15] Number of mismatches allowed: 4
[2019-08-23 09:50:15] --------------------------------
[2019-08-23 09:52:33] Start reading and trimming the input sequences
Detected data format: fastq
[2019-08-23 09:52:51] Start mapping
[2019-08-23 09:52:51] Starting commands:
[2019-08-23 09:52:51] Launched: None/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/W_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/Trimed_FCT_1.fa.tmp-6778898 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/Trimed_RGA_2.fa.tmp-6778898 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/W_C2T_fr_m4.mapping.tmp-6778898
[2019-08-23 09:52:51] Launched: None/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/C_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/Trimed_FCT_1.fa.tmp-6778898 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/Trimed_RGA_2.fa.tmp-6778898 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-t9gMui/C_C2T_fr_m4.mapping.tmp-6778898
Traceback (most recent call last):
  File "bs_seeker2-align.py", line 469, in <module>
    options.Output_unmapped_hit
  File "/shares/pi_clr/BSSeeker/BSseeker2-master/bs_align/bs_pair_end.py", line 799, in bs_pair_end
    'output_file' : CG2A_fr} ])
  File "/shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/utils.py", line 332, in run_in_parallel
    for i, proc in enumerate([subprocess.Popen(args = shlex.split(cmd), stdout = stdout) for cmd, stdout in commands]):
  File "/apps/python/2.7.15-el7/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/apps/python/2.7.15-el7/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
[clr@rra-login1 BSseeker2-master]$

christinalrichards commented 5 years ago

I may have solved this error by "adding" bowtie2 again since it read (on line 6): [2019-08-23 09:50:15] Aligner command: None/Bowtie2...

now that line says: [2019-08-23 12:50:12] Aligner command: /apps/bowtie/2.3.4.1/bin/bowtie2...

I had run [clr@rra-login1 ~]$ module add apps/bowtie/2.3.4.1 to create the index, but I guess I have to re-add it for each session?
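Environment modules normally apply only to the shell session in which they were loaded, which would explain the "None/bowtie2" in the earlier log. A minimal sketch (Python 3.3 or later) for checking that the aligner is visible on PATH before starting a long run; `aligner_path` is a hypothetical helper name:

```python
import shutil


def aligner_path(executable="bowtie2"):
    """Return the full path of the aligner found on PATH, or None if it is absent."""
    return shutil.which(executable)
```

If `aligner_path()` returns None, loading the module again (here `module add apps/bowtie/2.3.4.1`) in the current session should make the aligner visible.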

Now it's reading:
[2019-08-23 12:52:49] Starting commands:
[2019-08-23 12:52:49] Launched: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/W_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/W_C2T_fr_m4.mapping.tmp-8477077
[2019-08-23 12:52:49] Launched: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/C_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/C_C2T_fr_m4.mapping.tmp-8477077

christinalrichards commented 5 years ago

Hi again!

It went this far but I'm not sure it was finished?

[2019-08-23 12:52:49] Launched: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/W_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/W_C2T_fr_m4.mapping.tmp-8477077
[2019-08-23 12:52:49] Launched: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/C_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/C_C2T_fr_m4.mapping.tmp-8477077
[2019-08-23 13:00:20] Finished: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/W_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/W_C2T_fr_m4.mapping.tmp-8477077
[2019-08-23 13:00:20] Finished: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/C_C2T -f -1 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10P.bam-bowtie2-local-TMP-a1eCZi/C_C2T_fr_m4.mapping.tmp-8477077

guoweilong commented 5 years ago

Hi @christinalrichards ,

Sorry for the late reply, as I might have missed this message in email.

It takes some time to run if you have lots of data. Here are some suggestions for improving the performance: https://github.com/BSSeeker/BSseeker2#1-performance

Best, Weilong

BSSeeker / BSseeker2

Building Index WGBS Error #16