biocore / deblur

Deblur is a greedy deconvolution algorithm based on known read error profiles.
BSD 3-Clause "New" or "Revised" License

Installation hiccup: pytz>= distribution not found, required by pandas #163

Closed kevinmcc21 closed 6 years ago

kevinmcc21 commented 6 years ago

After following installation instructions, I kept getting this error when trying to run deblur:

pkg_resources.DistributionNotFound: The 'pytz>=2011k' distribution was not found and is required by pandas

Running "pip install --pre pytz" seems to have fixed the problem.

wasade commented 6 years ago

Did you install deblur using conda?

kevinmcc21 commented 6 years ago

Yep, followed these command line instructions:

conda create -n deblurenv python=3.5 numpy
source activate deblurenv
conda install -c bioconda -c biocore VSEARCH MAFFT=7.310 biom-format SortMeRNA==2.0 deblur

wasade commented 6 years ago

When I issue a similar command, I see that pytz: 2017.2-py36_0 is included in the packages to be installed. Is that not listed for you?

kevinmcc21 commented 6 years ago

It is listed and when I tried to install pytz explicitly, I got a message saying it was already installed. So it being installed was not the problem. Running the command line in my solution did fix it though.
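
One way to narrow down this kind of mismatch is to check separately whether the module can be imported and whether its distribution metadata is visible, since pkg_resources resolves requirements from installed metadata rather than by importing the module. The sketch below uses the standard-library importlib.metadata (Python 3.8+, so newer than the python=3.5 environment in this thread). The helper name `diagnose` is hypothetical, and whether missing metadata was the actual cause here is an assumption that the thread does not confirm.

```python
import importlib.util
from importlib import metadata


def diagnose(module_name, dist_name=None):
    """Report whether a module is importable and whether its
    distribution metadata can be found (hypothetical helper)."""
    dist_name = dist_name or module_name
    # find_spec locates the module on the import path without importing it.
    importable = importlib.util.find_spec(module_name) is not None
    try:
        # Looks up the installed distribution's metadata by name.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"importable": importable, "metadata_version": version}


if __name__ == "__main__":
    # Run inside the affected environment.
    print(diagnose("pytz"))
```

A result with `importable` true but `metadata_version` empty would point at the package metadata rather than at the package files themselves.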

wasade commented 6 years ago

This seems like an upstream issue. If you just do conda install pandas, does the pytz error occur? If yes, would you consider opening an issue on their tracker?

kevinmcc21 commented 6 years ago

No, it does not. It says "All requested packages already installed."

Unrelated, I got a lot of warnings when running the script (looks like one warning per library) from line 849 here: https://github.com/biocore/deblur/blob/master/deblur/workflow.py

"Problem removing artifacts from file" and then it shows the destination file under a split/ directory in my specified output folder. There is no split/ directory in my specified output folder, so I'm guessing that is the problem: the "split" subfolder was never created.

wasade commented 6 years ago

Would it be possible to test the conda install pandas in an independent environment? I think the message you received is due to executing it in an environment where pandas is already installed. We do not maintain pandas, or use pytz directly (or indirectly); the error being reported appears to be due to pandas, so I'm just trying to help isolate where it is to better direct how to come to a resolution.

For the warnings, can you add --keep-tmp-files to the command? @amnona, do you have any idea regarding the warning message?

kevinmcc21 commented 6 years ago

I created a new environment and ran conda install pandas before installing deblur but I got the same message:

Traceback (most recent call last):
  File "/home/kevin/anaconda3/envs/testenv/bin/deblur", line 4, in <module>
    __import__('pkg_resources').run_script('deblur==1.0.3', 'deblur')
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 3144, in <module>
    @_call_aside
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 3128, in _call_aside
    f(*args, **kwargs)
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 3157, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 666, in _build_master
    ws.require(__requires__)
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 984, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/kevin/anaconda3/envs/testenv/lib/python3.5/site-packages/pkg_resources/__init__.py", line 870, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'pytz>=2011k' distribution was not found and is required by pandas

Running the "pip install --pre pytz" command seems to have again fixed it. I am running deblur now with the --keep-tmp-files parameter.

kevinmcc21 commented 6 years ago

I have finished running the command with --keep-tmp-files; the directories split/ and deblur_working_dir/ are both populated with files. I still got the warning messages that the files in the split/ directory could not have artifacts removed. All other files are the same size as they were in the original run (without --keep-tmp-files).

The resulting all.biom is 16M. Does this seem to be around the size you would expect? There were 492 libraries in the input file, with a total of ~32M reads.

wasade commented 6 years ago

That seems plausible. Are these 16S data?

Sorry for not getting back on the other comment. Am sitting on a few deadlines at the moment.

kevinmcc21 commented 6 years ago

No worries at all.

Yes, this is 16S data.

amnona commented 6 years ago

Hi, in order to understand the "Problem removing artifacts from file" warning, could you rerun with debug level set to highest details (using the flags: --log-level 1 --log-file debuglog.txt --keep-tmp-files) and then look/send the output in debuglog.txt? My guess is there is a problem with running sortmerna, but best to look at the detailed log file and see :)
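
For reference, the rerun suggested above could be scripted roughly as follows. This is only a sketch: the three logging flags are the ones named in the comment, while the `deblur workflow` subcommand with `--seqs-fp`, `--output-dir` and `-t` is my recollection of the deblur command line and should be checked against `deblur workflow --help`. The input path, output directory and trim length are placeholders, and `build_debug_command` is a hypothetical helper. The function only builds the argument list, so nothing runs until the list is handed to subprocess.

```python
import shlex


def build_debug_command(seqs_fp, output_dir, trim_length):
    """Build an argument list for a verbose deblur rerun (hypothetical helper)."""
    return [
        "deblur", "workflow",            # assumed subcommand name
        "--seqs-fp", seqs_fp,            # assumed input flag
        "--output-dir", output_dir,      # assumed output flag
        "-t", str(trim_length),          # assumed trim-length flag
        "--log-level", "1",              # flags below are quoted from the thread
        "--log-file", "debuglog.txt",
        "--keep-tmp-files",
    ]


if __name__ == "__main__":
    cmd = build_debug_command("seqs.fna", "deblur_output", 250)
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```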

Thanks Amnon

kevinmcc21 commented 6 years ago

No problem, I'll run it right away.

I had an additional question regarding the results. The total read counts in the OTU table produced by deblur seem to only include ~50% of our reads. After removing bad libraries, our previous total read count was ~27M; applying deblur reduces this to ~13M. Does this seem consistent with intended behavior? Our (prior) understanding was that deblur groups reads together, by identifying likely sequencing errors, but we did not think that it actually removes reads from downstream analysis.

I'll take another look at the paper to further my understanding as well. Thanks very much for the help!

amnona commented 6 years ago

Cool. Deblur does not try to correct reads containing errors but rather throws them away. For an error rate of 0.005 per nucleotide and a read length of 150, you expect (1-0.005)^150 =~ 0.5, so approximately half of the reads will be error free and the other half will be thrown away. So your number sounds reasonable. Does this make sense?
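
The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes errors are independent across positions with a constant per-nucleotide rate, which is a simplification for illustration and not a description of deblur's internals. `error_free_fraction` is a hypothetical helper.

```python
def error_free_fraction(per_base_error, read_length):
    """Probability that a read of the given length has no errors,
    assuming independent errors at a constant per-base rate."""
    return (1.0 - per_base_error) ** read_length


if __name__ == "__main__":
    for length in (150, 250):
        fraction = error_free_fraction(0.005, length)
        print(f"length {length}: {fraction:.3f} of reads expected error free")
```

With a 0.005 error rate this gives about 0.47 for 150 nt reads. The same model gives about 0.29 for 250 nt reads, so the expected loss grows quickly with read length.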

kevinmcc21 commented 6 years ago

Hm, this does make some sense. But how does it know which reads have errors?

Also, I see that by default the script removes reads with < 10 copies. So that might be accounting for many of the removed reads.
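
As a toy illustration of how a minimum-count filter can reduce the total read count, the sketch below drops every sequence whose summed count falls below a threshold. The threshold of 10 is the default mentioned above; applying it to the total across all samples (rather than per sample) is my understanding of deblur's behavior and should be treated as an assumption. The sequences, sample names and counts are invented, and `filter_min_reads` is a hypothetical helper, not deblur code.

```python
def filter_min_reads(counts, min_reads=10):
    """Keep sequences whose total count across samples is >= min_reads.

    counts maps sequence -> {sample_name: read_count}.
    """
    return {
        seq: per_sample
        for seq, per_sample in counts.items()
        if sum(per_sample.values()) >= min_reads
    }


if __name__ == "__main__":
    toy = {
        "ACGT": {"s1": 40, "s2": 12},  # total 52, kept
        "ACGA": {"s1": 3, "s2": 4},    # total 7, dropped
        "TTGA": {"s1": 6, "s2": 4},    # total 10, kept
    }
    kept = filter_min_reads(toy)
    before = sum(sum(v.values()) for v in toy.values())
    after = sum(sum(v.values()) for v in kept.values())
    print(f"reads before: {before}, after: {after}")
```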

kevinmcc21 commented 6 years ago

The debuglog.txt file is attached.
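
A log like this is easier to read once the per-sample counts are pulled out. The sketch below is one way to do that with regular expressions. The two message formats are copied from entries in the attached log, while `parse_log` and the returned dictionary keys are hypothetical names chosen for illustration. It scans the whole text instead of going line by line, because the log as pasted here has several entries per line.

```python
import re

# Patterns copied from message formats that appear in the attached log.
ARTIFACT_RE = re.compile(
    r"total sequences (\d+), passing sequences (\d+), failing sequences (\d+)"
)
DEBLUR_RE = re.compile(r"(\d+) unique sequences left following deblurring")


def parse_log(text):
    """Extract artifact-filter counts and post-deblur unique counts."""
    return {
        "artifact_filter": [
            tuple(int(n) for n in m.groups()) for m in ARTIFACT_RE.finditer(text)
        ],
        "unique_after_deblur": [int(m.group(1)) for m in DEBLUR_RE.finditer(text)],
    }


if __name__ == "__main__":
    sample = (
        "INFO(140151359256320)2017-12-06 16:12:43,881:total sequences 43321, "
        "passing sequences 41702, failing sequences 1619 "
        "INFO(140151359256320)2017-12-06 17:07:24,407:2637 unique sequences "
        "left following deblurring"
    )
    print(parse_log(sample))
```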

INFO(140151359256320)2017-12-06 15:57:59,141:* INFO(140151359256320)2017-12-06 15:57:59,141:deblurring started WARNING(140151359256320)2017-12-06 15:57:59,141:deblur version 1.0.3 workflow started on /home/kevin/projects/islandGut/library/seqs.fna WARNING(140151359256320)2017-12-06 15:57:59,142:parameters: {'is_worker_thread': None, 'threads_per_sample': 1, 'left_trim_length': 0, 'logger': <logging.Logger object at 0x7f770648b5c0>, 'pos_ref_fp': (), 'neg_ref_fp': (), 'log_level': 1, 'jobs_to_start': 1, 'overwrite': None, 'min_reads': 10, 'error_dist': [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005], 'log_file': '/home/kevin/projects/islandGut/debuglog.txt', 'pos_ref_db_fp': (), 'keep_tmp_files': True, 'indel_max': 3, 'seqs_fp': '/home/kevin/projects/islandGut/library/seqs.fna', 'min_size': 2, 'output_dir': '/home/kevin/projects/islandGut/deblur2', 'indel_prob': 0.01, 'mean_error': 0.005, 'neg_ref_db_fp': (), 'trim_length': 250} INFO(140151359256320)2017-12-06 15:57:59,142:error_dist is : [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] DEBUG(140151359256320)2017-12-06 15:57:59,142:Using default positive filtering file ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta'] DEBUG(140151359256320)2017-12-06 15:57:59,142:Using default negative filtering file ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa'] INFO(140151359256320)2017-12-06 15:57:59,142:deblur main program started INFO(140151359256320)2017-12-06 15:57:59,142:splitting file /home/kevin/projects/islandGut/library/seqs.fna to per sample fasta. 
output /home/kevin/projects/islandGut/deblur2/split INFO(140151359256320)2017-12-06 15:57:59,142:split_sequence_file_on_sample_ids_to_files for file <_io.TextIOWrapper name='/home/kevin/projects/islandGut/library/seqs.fna' mode='U' encoding='UTF-8'> into dir /home/kevin/projects/islandGut/deblur2/split INFO(140151359256320)2017-12-06 16:11:08,771:split to 488 files INFO(140151359256320)2017-12-06 16:11:08,797:building negative db sortmerna index files INFO(140151359256320)2017-12-06 16:11:08,797:build_index_sortmerna files ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa'] to dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140151359256320)2017-12-06 16:11:08,797:processing file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa into location /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts DEBUG(140151359256320)2017-12-06 16:11:08,797:system call: ['indexdb_rna', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--tmpdir', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir'] DEBUG(140151359256320)2017-12-06 16:11:09,109:file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa indexed INFO(140151359256320)2017-12-06 16:11:09,109:building positive db sortmerna index files INFO(140151359256320)2017-12-06 16:11:09,109:build_index_sortmerna files ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta'] to dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140151359256320)2017-12-06 16:11:09,109:processing file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta into location /home/kevin/projects/islandGut/deblur2/deblur_working_dir/88_otus 
DEBUG(140151359256320)2017-12-06 16:11:09,110:system call: ['indexdb_rna', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/88_otus', '--tmpdir', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir'] DEBUG(140151359256320)2017-12-06 16:12:04,896:file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta indexed INFO(140151359256320)2017-12-06 16:12:04,896:processing per sample fasta files INFO(140151359256320)2017-12-06 16:12:04,897:-------------------------------------------------------- INFO(140151359256320)2017-12-06 16:12:04,897:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.BP2.Feces.1.1.fasta DEBUG(140151359256320)2017-12-06 16:12:28,411:trimmed to length 250 (1091492 / 1091519 remaining) INFO(140151359256320)2017-12-06 16:12:28,411:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim DEBUG(140151359256320)2017-12-06 16:12:28,411:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140151359256320)2017-12-06 16:12:36,711:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 16:12:36,711:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs 
/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 16:12:36,711:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140151359256320)2017-12-06 16:12:43,881:total sequences 43321, passing sequences 41702, failing sequences 1619 INFO(140151359256320)2017-12-06 16:12:43,882:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140151359256320)2017-12-06 16:12:43,882:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140151359256320)2017-12-06 17:02:18,368:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140151359256320)2017-12-06 17:02:25,746:deblurring 41702 sequences INFO(140151359256320)2017-12-06 17:07:24,407:2637 unique sequences left following deblurring INFO(140151359256320)2017-12-06 17:07:24,486:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140151359256320)2017-12-06 17:07:24,486:system call: ['vsearch', '--uchime_denovo', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140151359256320)2017-12-06 17:07:27,142:finished processing file INFO(140151359256320)2017-12-06 17:07:27,155:-------------------------------------------------------- INFO(140151359256320)2017-12-06 17:07:27,155:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.AK3.Feces.1.1.fasta DEBUG(140151359256320)2017-12-06 17:07:51,623:trimmed to length 250 (990996 / 991040 remaining) INFO(140151359256320)2017-12-06 17:07:51,624:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim DEBUG(140151359256320)2017-12-06 17:07:51,624:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140151359256320)2017-12-06 17:07:55,045:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 17:07:55,045:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 17:07:55,045:system call: ['sortmerna', '--reads', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140151359256320)2017-12-06 17:07:58,394:total sequences 42440, passing sequences 41427, failing sequences 1013 INFO(140151359256320)2017-12-06 17:07:58,394:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140151359256320)2017-12-06 17:07:58,395:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140151359256320)2017-12-06 17:53:00,970:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140151359256320)2017-12-06 17:53:08,030:deblurring 41427 sequences INFO(140151359256320)2017-12-06 18:01:47,845:4501 unique sequences left following deblurring INFO(140151359256320)2017-12-06 18:01:47,972:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140151359256320)2017-12-06 18:01:47,972:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', 
'-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140151359256320)2017-12-06 18:01:53,311:finished processing file INFO(140151359256320)2017-12-06 18:01:53,327:-------------------------------------------------------- INFO(140151359256320)2017-12-06 18:01:53,327:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.AK2.Feces.1.1.fasta DEBUG(140151359256320)2017-12-06 18:02:15,640:trimmed to length 250 (811120 / 811160 remaining) INFO(140151359256320)2017-12-06 18:02:15,640:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim DEBUG(140151359256320)2017-12-06 18:02:15,640:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140151359256320)2017-12-06 18:02:18,510:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 18:02:18,510:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 18:02:18,510:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140151359256320)2017-12-06 18:02:20,748:total sequences 23733, passing sequences 22640, failing sequences 1093 INFO(140151359256320)2017-12-06 18:02:20,748:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140151359256320)2017-12-06 18:02:20,748:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140151359256320)2017-12-06 18:14:54,745:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140151359256320)2017-12-06 18:14:58,605:deblurring 22640 sequences INFO(140151359256320)2017-12-06 18:17:48,648:2697 unique sequences left following deblurring INFO(140151359256320)2017-12-06 18:17:48,714:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140151359256320)2017-12-06 18:17:48,714:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140151359256320)2017-12-06 18:17:51,681:finished processing file INFO(140151359256320)2017-12-06 18:17:51,690:-------------------------------------------------------- 
INFO(140151359256320)2017-12-06 18:17:51,691:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.EG3.Feces.1.1.fasta DEBUG(140151359256320)2017-12-06 18:18:55,188:trimmed to length 250 (2712646 / 2712789 remaining) INFO(140151359256320)2017-12-06 18:18:55,188:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim DEBUG(140151359256320)2017-12-06 18:18:55,188:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140151359256320)2017-12-06 18:19:04,539:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 18:19:04,539:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep DEBUG(140151359256320)2017-12-06 18:19:04,539:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140151359256320)2017-12-06 18:19:10,414:total sequences 83940, passing sequences 
79625, failing sequences 4315 INFO(140151359256320)2017-12-06 18:19:10,414:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140151359256320)2017-12-06 18:19:10,415:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140151359256320)2017-12-06 19:17:55,077:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140151359256320)2017-12-06 19:18:14,503:deblurring 79625 sequences INFO(140151359256320)2017-12-06 19:53:18,632:8850 unique sequences left following deblurring INFO(140372259038976)2017-12-07 11:11:09,749:* INFO(140372259038976)2017-12-07 11:11:09,749:deblurring started WARNING(140372259038976)2017-12-07 11:11:09,749:deblur version 1.0.3 workflow started on /home/kevin/projects/islandGut/library/seqs.fna WARNING(140372259038976)2017-12-07 11:11:09,749:parameters: {'jobs_to_start': 1, 'trim_length': 250, 'min_size': 2, 'error_dist': [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005], 'pos_ref_db_fp': (), 'is_worker_thread': None, 'left_trim_length': 0, 'indel_prob': 0.01, 'log_level': 1, 'min_reads': 10, 'neg_ref_db_fp': (), 'keep_tmp_files': True, 'threads_per_sample': 1, 'output_dir': '/home/kevin/projects/islandGut/deblur2', 'indel_max': 3, 'neg_ref_fp': (), 'logger': <logging.Logger object at 0x7faa74f3b6a0>, 'pos_ref_fp': (), 'seqs_fp': '/home/kevin/projects/islandGut/library/seqs.fna', 'overwrite': None, 'log_file': '/home/kevin/projects/islandGut/debuglog.txt', 'mean_error': 0.005} INFO(140372259038976)2017-12-07 11:11:09,749:error_dist is : [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] DEBUG(140372259038976)2017-12-07 11:11:09,749:Using default positive filtering file 
['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta']
DEBUG(140372259038976)2017-12-07 11:11:09,749:Using default negative filtering file ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa']
INFO(140372259038976)2017-12-07 11:11:09,749:deblur main program started
CRITICAL(140372259038976)2017-12-07 11:11:09,749:output directory /home/kevin/projects/islandGut/deblur2 already exists
INFO(140679197693696)2017-12-07 11:52:54,191:*****
INFO(140679197693696)2017-12-07 11:52:54,191:deblurring started
WARNING(140679197693696)2017-12-07 11:52:54,191:deblur version 1.0.3 workflow started on /home/kevin/projects/islandGut/library/seqs.fna
WARNING(140679197693696)2017-12-07 11:52:54,191:parameters: {'left_trim_length': 0, 'indel_max': 3, 'pos_ref_fp': (), 'is_worker_thread': None, 'min_reads': 10, 'log_file': '/home/kevin/projects/islandGut/debuglog.txt', 'min_size': 2, 'neg_ref_db_fp': (), 'trim_length': 250, 'log_level': 1, 'indel_prob': 0.01, 'threads_per_sample': 1, 'keep_tmp_files': True, 'logger': <logging.Logger object at 0x7ff1ebe7c4a8>, 'jobs_to_start': 1, 'error_dist': [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005], 'mean_error': 0.005, 'output_dir': '/home/kevin/projects/islandGut/deblur2', 'neg_ref_fp': (), 'overwrite': None, 'seqs_fp': '/home/kevin/projects/islandGut/library/seqs.fna', 'pos_ref_db_fp': ()}
INFO(140679197693696)2017-12-07 11:52:54,191:error_dist is : [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005]
DEBUG(140679197693696)2017-12-07 11:52:54,192:Using default positive filtering file ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta']
DEBUG(140679197693696)2017-12-07 11:52:54,192:Using default negative filtering file ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa']
INFO(140679197693696)2017-12-07 11:52:54,192:deblur main program started INFO(140679197693696)2017-12-07 11:52:54,192:splitting file /home/kevin/projects/islandGut/library/seqs.fna to per sample fasta. output /home/kevin/projects/islandGut/deblur2/split INFO(140679197693696)2017-12-07 11:52:54,192:split_sequence_file_on_sample_ids_to_files for file <_io.TextIOWrapper name='/home/kevin/projects/islandGut/library/seqs.fna' mode='U' encoding='UTF-8'> into dir /home/kevin/projects/islandGut/deblur2/split INFO(140679197693696)2017-12-07 12:04:06,382:split to 488 files INFO(140679197693696)2017-12-07 12:04:06,397:building negative db sortmerna index files INFO(140679197693696)2017-12-07 12:04:06,397:build_index_sortmerna files ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa'] to dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 12:04:06,397:processing file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa into location /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts DEBUG(140679197693696)2017-12-07 12:04:06,397:system call: ['indexdb_rna', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--tmpdir', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir'] DEBUG(140679197693696)2017-12-07 12:04:06,467:file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa indexed INFO(140679197693696)2017-12-07 12:04:06,467:building positive db sortmerna index files INFO(140679197693696)2017-12-07 12:04:06,467:build_index_sortmerna files ['/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta'] to dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 
12:04:06,467:processing file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta into location /home/kevin/projects/islandGut/deblur2/deblur_working_dir/88_otus DEBUG(140679197693696)2017-12-07 12:04:06,467:system call: ['indexdb_rna', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/88_otus', '--tmpdir', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir'] DEBUG(140679197693696)2017-12-07 12:04:56,770:file /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/88_otus.fasta indexed INFO(140679197693696)2017-12-07 12:04:56,770:processing per sample fasta files INFO(140679197693696)2017-12-07 12:04:56,770:-------------------------------------------------------- INFO(140679197693696)2017-12-07 12:04:56,770:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.BP2.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 12:05:17,537:trimmed to length 250 (1091492 / 1091519 remaining) INFO(140679197693696)2017-12-07 12:05:17,537:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 12:05:17,537:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 12:05:21,113:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 12:05:21,114:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir 
/home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 12:05:21,114:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 12:05:24,123:total sequences 43321, passing sequences 41702, failing sequences 1619 INFO(140679197693696)2017-12-07 12:05:24,123:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 12:05:24,124:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 12:52:30,367:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 12:52:36,696:deblurring 41702 sequences INFO(140679197693696)2017-12-07 12:56:52,268:2637 unique sequences left following deblurring INFO(140679197693696)2017-12-07 12:56:52,348:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 12:56:52,348:system call: 
['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.BP2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 12:56:54,874:finished processing file INFO(140679197693696)2017-12-07 12:56:54,884:-------------------------------------------------------- INFO(140679197693696)2017-12-07 12:56:54,884:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.AK3.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 12:57:13,898:trimmed to length 250 (990996 / 991040 remaining) INFO(140679197693696)2017-12-07 12:57:13,898:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 12:57:13,898:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 12:57:16,322:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 12:57:16,323:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 12:57:16,323:system 
call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 12:57:19,157:total sequences 42440, passing sequences 41427, failing sequences 1013 INFO(140679197693696)2017-12-07 12:57:19,157:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 12:57:19,157:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 13:39:13,142:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 13:39:20,067:deblurring 41427 sequences INFO(140679197693696)2017-12-07 13:47:33,769:4501 unique sequences left following deblurring INFO(140679197693696)2017-12-07 13:47:33,907:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 13:47:33,907:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 13:47:38,841:finished processing file INFO(140679197693696)2017-12-07 13:47:38,856:-------------------------------------------------------- INFO(140679197693696)2017-12-07 13:47:38,856:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.AK2.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 13:47:59,032:trimmed to length 250 (811120 / 811160 remaining) INFO(140679197693696)2017-12-07 13:47:59,032:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 13:47:59,032:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 13:48:00,784:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 13:48:00,785:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 13:48:00,785:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep', '--ref', 
'/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 13:48:02,955:total sequences 23733, passing sequences 22640, failing sequences 1093 INFO(140679197693696)2017-12-07 13:48:02,955:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 13:48:02,955:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 14:00:51,071:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 14:00:55,079:deblurring 22640 sequences INFO(140679197693696)2017-12-07 14:03:38,276:2697 unique sequences left following deblurring INFO(140679197693696)2017-12-07 14:03:38,346:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 14:03:38,346:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK2.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] 
INFO(140679197693696)2017-12-07 14:03:40,865:finished processing file INFO(140679197693696)2017-12-07 14:03:40,875:-------------------------------------------------------- INFO(140679197693696)2017-12-07 14:03:40,875:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.EG3.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 14:04:42,200:trimmed to length 250 (2712646 / 2712789 remaining) INFO(140679197693696)2017-12-07 14:04:42,200:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 14:04:42,200:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 14:04:47,865:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 14:04:47,866:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 14:04:47,866:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 14:04:54,047:total sequences 83940, passing sequences 79625, failing sequences 4315 INFO(140679197693696)2017-12-07 14:04:54,047:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 14:04:54,048:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 14:59:17,064:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 14:59:35,799:deblurring 79625 sequences INFO(140679197693696)2017-12-07 15:33:13,200:8850 unique sequences left following deblurring INFO(140679197693696)2017-12-07 15:33:13,527:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 15:33:13,527:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG3.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 15:33:23,532:finished processing file INFO(140679197693696)2017-12-07 15:33:23,563:-------------------------------------------------------- 
INFO(140679197693696)2017-12-07 15:33:23,564:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.EG4.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 15:34:04,494:trimmed to length 250 (1901969 / 1902080 remaining) INFO(140679197693696)2017-12-07 15:34:04,494:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 15:34:04,494:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 15:34:11,070:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 15:34:11,070:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 15:34:11,070:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 15:34:16,257:total sequences 66692, passing sequences 
64022, failing sequences 2670 INFO(140679197693696)2017-12-07 15:34:16,257:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 15:34:16,257:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 16:10:37,777:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 16:10:51,218:deblurring 64022 sequences INFO(140679197693696)2017-12-07 16:37:10,492:9522 unique sequences left following deblurring INFO(140679197693696)2017-12-07 16:37:10,779:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 16:37:10,779:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG4.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 16:37:21,584:finished processing file INFO(140679197693696)2017-12-07 16:37:21,613:-------------------------------------------------------- INFO(140679197693696)2017-12-07 16:37:21,613:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.EG5.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 16:37:55,232:trimmed to length 250 (1574737 / 1574816 remaining) INFO(140679197693696)2017-12-07 
16:37:55,233:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 16:37:55,233:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 16:37:58,741:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 16:37:58,741:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 16:37:58,741:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 16:38:03,461:total sequences 72157, passing sequences 70359, failing sequences 1798 INFO(140679197693696)2017-12-07 16:38:03,461:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 16:38:03,461:system 
call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 17:24:56,469:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 17:25:15,344:deblurring 70359 sequences INFO(140679197693696)2017-12-07 18:00:08,487:9000 unique sequences left following deblurring INFO(140679197693696)2017-12-07 18:00:08,875:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 18:00:08,875:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.EG5.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 18:00:21,819:finished processing file INFO(140679197693696)2017-12-07 18:00:23,094:-------------------------------------------------------- INFO(140679197693696)2017-12-07 18:00:23,094:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/Whale.AK1.Feces.1.1.fasta DEBUG(140679197693696)2017-12-07 18:00:53,862:trimmed to length 250 (1177747 / 1177827 remaining) INFO(140679197693696)2017-12-07 18:00:53,863:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 18:00:53,863:system call: ['vsearch', '--derep_fulllength', 
'/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 18:00:57,010:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:00:57,010:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:00:57,011:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 18:01:03,573:total sequences 59507, passing sequences 57979, failing sequences 1528 INFO(140679197693696)2017-12-07 18:01:03,574:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 18:01:03,574:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.no_artifacts'] 
DEBUG(140679197693696)2017-12-07 18:29:25,722:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 18:29:38,266:deblurring 57979 sequences INFO(140679197693696)2017-12-07 18:44:29,240:6386 unique sequences left following deblurring INFO(140679197693696)2017-12-07 18:44:29,425:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 18:44:29,425:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/Whale.AK1.Feces.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:35,816:finished processing file INFO(140679197693696)2017-12-07 18:44:35,841:-------------------------------------------------------- INFO(140679197693696)2017-12-07 18:44:35,841:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/ExtractionBlank.B1.whales.1.1.fasta DEBUG(140679197693696)2017-12-07 18:44:36,019:trimmed to length 250 (7880 / 7881 remaining) INFO(140679197693696)2017-12-07 18:44:36,019:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 18:44:36,019:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', 
'--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:36,055:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:36,056:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:36,056:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 18:44:36,609:total sequences 644, passing sequences 642, failing sequences 2 INFO(140679197693696)2017-12-07 18:44:36,610:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 18:44:36,610:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 18:44:38,157:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 18:44:38,224:deblurring 642 
sequences INFO(140679197693696)2017-12-07 18:44:38,528:257 unique sequences left following deblurring INFO(140679197693696)2017-12-07 18:44:38,531:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 18:44:38,531:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B1.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:38,782:finished processing file INFO(140679197693696)2017-12-07 18:44:38,783:-------------------------------------------------------- INFO(140679197693696)2017-12-07 18:44:38,783:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/ExtractionBlank.B2.whales.1.1.fasta DEBUG(140679197693696)2017-12-07 18:44:38,834:trimmed to length 250 (833 / 834 remaining) INFO(140679197693696)2017-12-07 18:44:38,834:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 18:44:38,834:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:38,851:remove_artifacts_seqs file 
/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:38,851:running on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:38,851:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 18:44:39,376:total sequences 84, passing sequences 84, failing sequences 0 INFO(140679197693696)2017-12-07 18:44:39,376:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 18:44:39,376:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 18:44:43,474:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 18:44:43,484:deblurring 84 sequences INFO(140679197693696)2017-12-07 18:44:43,497:52 unique sequences left following deblurring 
INFO(140679197693696)2017-12-07 18:44:43,497:remove_chimeras_denovo_from_seqs seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07 18:44:43,498:system call: ['vsearch', '--uchime_denovo', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblur', '--nonchimeras', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B2.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblur.no_chimeras', '-dn', '0.000001', '-xn', '1000', '-minh', '10000000', '--mindiffs', '5', '--fasta_width', '0', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:43,559:finished processing file INFO(140679197693696)2017-12-07 18:44:43,559:-------------------------------------------------------- INFO(140679197693696)2017-12-07 18:44:43,559:launch_workflow for file /home/kevin/projects/islandGut/deblur2/split/ExtractionBlank.B3.whales.1.1.fasta DEBUG(140679197693696)2017-12-07 18:44:43,592:trimmed to length 250 (673 / 673 remaining) INFO(140679197693696)2017-12-07 18:44:43,592:dereplicate seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim DEBUG(140679197693696)2017-12-07 18:44:43,592:system call: ['vsearch', '--derep_fulllength', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim', '--output', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep', '--sizeout', '--fasta_width', '0', '--minuniquesize', '2', '--quiet', '--threads', '1'] INFO(140679197693696)2017-12-07 18:44:43,610:remove_artifacts_seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:43,610:running 
on ref_fp /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir refdb_fp /home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts seqs /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep DEBUG(140679197693696)2017-12-07 18:44:43,610:system call: ['sortmerna', '--reads', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep', '--ref', '/home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa,/home/kevin/projects/islandGut/deblur2/deblur_working_dir/artifacts', '--aligned', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep.sortmerna', '--blast', '3', '--best', '1', '--print_all_reads', '-v', '-e', '100'] INFO(140679197693696)2017-12-07 18:44:44,129:total sequences 57, passing sequences 57, failing sequences 0 INFO(140679197693696)2017-12-07 18:44:44,129:multiple_sequence_alignment seqs file /home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep.no_artifacts DEBUG(140679197693696)2017-12-07 18:44:44,129:system call: ['mafft', '--quiet', '--preservecase', '--parttree', '--auto', '--thread', '1', '/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep.no_artifacts'] DEBUG(140679197693696)2017-12-07 18:44:46,240:Using error profile [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005] INFO(140679197693696)2017-12-07 18:44:46,246:deblurring 57 sequences INFO(140679197693696)2017-12-07 18:44:46,252:44 unique sequences left following deblurring INFO(140679197693696)2017-12-07 18:44:46,253:remove_chimeras_denovo_from_seqs seqs file 
/home/kevin/projects/islandGut/deblur2/deblur_working_dir/ExtractionBlank.B3.whales.1.1.fasta.trim.derep.no_artifacts.msa.deblurto working dir /home/kevin/projects/islandGut/deblur2/deblur_working_dir DEBUG(140679197693696)2017-12-07

kevinmcc21 commented 6 years ago

Any update here?

I'm also curious to know if deblur records a list of sequences that are considered "known sequencing artifacts (such as PhiX)" for posterity purposes. After changing the minimum read count to 0, I still see ~25% of reads being removed, so I'm curious to know if the rest are all considered sequencing artifacts.

On Fri, Dec 8, 2017 at 11:29 AM, Kevin McCormick < kevinmcc@pennmedicine.upenn.edu> wrote:

The debuglog.txt file is attached.

On Wed, Dec 6, 2017 at 4:39 PM, Kevin McCormick < kevinmcc@pennmedicine.upenn.edu> wrote:

Hm, this does make some sense. But how does it know which reads have errors?

Also, I see that by default the script removes reads with < 10 copies. So that might be accounting for many of the removed reads.

On Wed, Dec 6, 2017 at 4:06 PM, amnona notifications@github.com wrote:

Cool. Deblur does not try to correct reads containing errors but rather throws them away. For an error rate of 0.005 per nucleotide and a read length of 150, you expect (1-0.005)^150=~0.5, so approximately half of the reads will be error-free and the other half will be thrown away. So your number sounds reasonable. Does this make sense?
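The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes independent errors at a fixed per-nucleotide rate, and the function name `error_free_fraction` is made up for illustration. The 0.005 rate and the length of 150 come from the comment above; 250 is the trim length that appears in the log for this run.

```python
def error_free_fraction(per_base_error: float, read_length: int) -> float:
    """Probability that a read of the given length contains no errors,
    assuming independent errors at a constant per-base rate."""
    return (1.0 - per_base_error) ** read_length


if __name__ == "__main__":
    # 150 nt is the length used in the comment; 250 nt is the trim length in the log.
    for length in (150, 250):
        fraction = error_free_fraction(0.005, length)
        print(f"length {length}: {fraction:.3f} of reads expected to be error-free")
```

Under these assumptions the fraction is a little under one half at 150 nt and drops to under a third at 250 nt, so losing half or more of the reads would not be surprising at the longer trim length.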

On Dec 6, 2017 10:54 PM, "kevinmcc21" notifications@github.com wrote:

No problem, I'll run it right away.

I had an additional question regarding the results. The total read counts in the OTU table produced by deblur seem to only include ~50% of our reads. After removing bad libraries, our previous total read count was ~27M; applying deblur reduces this to ~13M. Does this seem consistent with intended behavior? Our (prior) understanding was that deblur groups reads together, by identifying likely sequencing errors, but we did not think that it actually removes reads from downstream analysis.

I'll take another look at the paper to further my understanding as well. Thanks very much for the help!

On Wed, Dec 6, 2017 at 2:56 PM, amnona notifications@github.com wrote:

Hi, in order to understand the "Problem removing artifacts from file" warning, could you rerun with debug level set to highest details (using the flags: --log-level 1 --log-file debuglog.txt --keep-tmp-files) and then look/send the output in debuglog.txt? My guess is there is a problem with running sortmerna, but best to look at the detailed log file and see :)

Thanks Amnon

On Wed, Dec 6, 2017 at 8:31 PM, kevinmcc21 notifications@github.com wrote:

No worries at all.

Yes, this is 16S data.

On Wed, Dec 6, 2017 at 12:38 PM, Daniel McDonald notifications@github.com wrote:

That seems plausible. Are these 16S data?

Sorry for not getting back on the other comment. Am sitting on a few deadlines at the moment.

amnona commented 6 years ago

Hi Kevin, sorry for the slow response. Looking at the log file, I don't see the source of the warning you are getting. Did you get the warning when you ran deblur this time? Also, since each run appends to the end of the log file, maybe GitHub cut the log file and we're not seeing the end of it here? Anyway, if you still got the warning, it would be best to attach one of the sample files (.fasta); if you are using a single demultiplexed fasta file, the per-sample fasta files are in the split subdirectory. Also, can you send the exact command you use to run deblur, and the output of "sortmerna --version"?

Regarding your additional questions:

  1. PhiX reads are removed before deblurring since they are a known sequence contaminant. There is no option to save those reads, but they are filtered based on the sequence in file: /home/kevin/anaconda3/envs/deblurenv/lib/python3.5/site-packages/deblur/support_files/artifacts.fa using the criteria of 95% identity and 95% coverage.

    • Note that you can change the sequencing artifacts file (which by default contains PhiX) using the --neg-ref-fp parameter. You can also see the number of PhiX reads in each sample in the log file, in lines similar to: INFO(140151359256320)2017-12-06 16:12:43,881:total sequences 43321, passing sequences 41702, failing sequences 1619
  2. I think the large number of dropped reads is due to the fact that deblur does not correct reads with errors but rather drops them. You can look at the log file and the read files after each step to see how many reads are left and where you lose most of them (it could also be some problem with the run, e.g. a lot of chimeras or PhiX reads, but I don't think so).
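One way to follow the suggestion of checking each step is to pull the per-sample counts out of the debug log. The sketch below is an assumption-laden helper, not part of deblur: the function `summarize` and the regular expressions are written against the message wording visible in the log posted in this thread ("launch_workflow for file", "trimmed to length", "total sequences ... passing sequences ... failing sequences", "unique sequences left following deblurring") and may need adjusting for other deblur versions. The sample text is abridged from that log.

```python
import re
from pathlib import PurePosixPath

# Patterns follow the message wording seen in the log posted in this thread.
LAUNCH = re.compile(r"launch_workflow\s+for\s+file\s+(\S+)")
TRIM = re.compile(r"trimmed\s+to\s+length\s+(\d+)\s+\((\d+)\s*/\s*(\d+)\s+remaining\)")
ARTIFACT = re.compile(
    r"total\s+sequences\s+(\d+),\s+passing\s+sequences\s+(\d+),\s+failing\s+sequences\s+(\d+)"
)
DEBLUR = re.compile(r"(\d+)\s+unique\s+sequences\s+left\s+following\s+deblurring")


def summarize(log_text):
    """Return one dict per sample with the counts logged at each step."""
    launches = list(LAUNCH.finditer(log_text))
    rows = []
    for i, start in enumerate(launches):
        # A sample's entries run until the next launch_workflow message.
        end = launches[i + 1].start() if i + 1 < len(launches) else len(log_text)
        chunk = log_text[start.end():end]
        row = {"sample": PurePosixPath(start.group(1)).name}
        trim = TRIM.search(chunk)
        if trim:
            row["reads_after_trim"] = int(trim.group(2))
            row["reads_input"] = int(trim.group(3))
        artifact = ARTIFACT.search(chunk)
        if artifact:
            row["derep_total"] = int(artifact.group(1))
            row["artifact_pass"] = int(artifact.group(2))
            row["artifact_fail"] = int(artifact.group(3))
        deblur = DEBLUR.search(chunk)
        if deblur:
            row["unique_after_deblur"] = int(deblur.group(1))
        rows.append(row)
    return rows


# Abridged from the debug log in this thread (two extraction blanks).
SAMPLE_LOG = (
    "INFO(140679197693696)2017-12-07 18:44:35,841:launch_workflow for file "
    "/home/kevin/projects/islandGut/deblur2/split/ExtractionBlank.B1.whales.1.1.fasta "
    "DEBUG(140679197693696)2017-12-07 18:44:36,019:trimmed to length 250 (7880 / 7881 remaining) "
    "INFO(140679197693696)2017-12-07 18:44:36,609:total sequences 644, passing sequences 642, "
    "failing sequences 2 "
    "INFO(140679197693696)2017-12-07 18:44:38,528:257 unique sequences left following deblurring "
    "INFO(140679197693696)2017-12-07 18:44:38,783:launch_workflow for file "
    "/home/kevin/projects/islandGut/deblur2/split/ExtractionBlank.B2.whales.1.1.fasta "
    "DEBUG(140679197693696)2017-12-07 18:44:38,834:trimmed to length 250 (833 / 834 remaining) "
    "INFO(140679197693696)2017-12-07 18:44:39,376:total sequences 84, passing sequences 84, "
    "failing sequences 0 "
    "INFO(140679197693696)2017-12-07 18:44:43,497:52 unique sequences left following deblurring "
)

rows = summarize(SAMPLE_LOG)
for row in rows:
    print(row)
```

Note that the "total sequences" figures are logged after the dereplication step, so they appear to count unique sequences rather than raw reads; comparing them directly with the read counts in the "trimmed to length" lines would be misleading.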

Does this make sense? Amnon

kevinmcc21 commented 6 years ago

Hello again,

Not sure if my last email went through. I had a 19M file attached but apparently github did not like that. What file size is allowable? The error message does not say what the limit is. I will send a .fasta file from the split folder that meets the standard.

Please see my other responses in the email below.

Kevin

On Mon, Dec 18, 2017 at 4:14 PM, Kevin McCormick < kevinmcc@pennmedicine.upenn.edu> wrote:

Hi Amnon,

I get the warning every time I run deblur to completion.

Attaching a sample .fasta file from the split directory.

The command used for this run was: deblur workflow --seqs-fp library/seqs.fna --output-dir deblur2/ -t 250 --keep-tmp-files --log-level 1 --log-file debuglog.txt

Output of sortmerna --version: SortMeRNA version 2.0, 29/11/2014

I understand the difference between "reads with errors" and "chimeras/phix reads" but I do not understand how deblur knows what to call a "read with errors" with only having a .fasta file as input (and not a .fastq file with read quality scores). So if it is indeed dropping reads with errors, it makes me very uneasy because I have no way of checking that removed reads do in fact look like errors. How is a read concluded to be an error? Is it based on read distribution/count?

Thanks, Kevin

amnona commented 6 years ago

Hi Kevin, regarding the error reads: this is the heart of the deblur algorithm. Since it knows the (upper bound) error profile of the Illumina run for each Hamming distance, we know the upper bound on the number of error reads expected from a real sequence. Say our highest-frequency sequence is AAA with 1000 reads, and we know the Hamming-distance-1 upper bound on error is 1%. Then up to 10 reads (1% of 1000 reads) of each neighboring sequence can be due to errors, and therefore we remove 10 reads from each Hamming-distance-1 sequence that is present. Based on our experience, the q-score (from fastq) is not good enough, since it is an average error probability and not an upper bound, and it does not include PCR errors. Deblur throws away the reads that are potentially PCR/read errors based on the neighboring sequences and the error profile. Does this make sense?
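The AAA example can be written out as a toy script. This is a simplified illustration of the idea described above, not deblur's actual implementation: the function names `hamming` and `toy_deblur` are hypothetical, only Hamming distance 1 is given a bound (the 1% from the example), and reads are subtracted only from sequences less abundant than the current one. The counts for the neighboring sequences are invented.

```python
from typing import Dict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def toy_deblur(counts: Dict[str, float], error_bound: Dict[int, float]) -> Dict[str, float]:
    """Greedy toy version of the subtraction step.

    Sequences are visited from most to least abundant. For each one, the
    upper bound on its error reads (bound * its remaining count) is removed
    from every less abundant sequence at that Hamming distance. Sequences
    whose count falls to zero or below are dropped.
    """
    order = sorted(counts, key=counts.get, reverse=True)
    remaining = dict(counts)
    for i, parent in enumerate(order):
        parent_count = remaining[parent]
        if parent_count <= 0:
            continue  # already explained away as errors of a more abundant sequence
        for child in order[i + 1:]:
            bound = error_bound.get(hamming(parent, child), 0.0)
            remaining[child] -= bound * parent_count
    return {seq: n for seq, n in remaining.items() if n > 0}


# AAA with 1000 reads and a 1% Hamming-1 bound are from the comment above;
# the other sequences and counts are made up for illustration.
toy_counts = {"AAA": 1000, "TTT": 300, "ATA": 25, "AAT": 8}
result = toy_deblur(toy_counts, {1: 0.01})
print(result)
```

Here ATA keeps 15 of its 25 reads, AAT (8 reads, fewer than the 10 that AAA could have produced by error) is removed entirely, and TTT is untouched because it is three mismatches away from AAA. This also shows why no quality scores are needed: the decision depends only on read counts, distances between sequences, and the error profile.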

Regarding the warning:

  1. I don't know the GitHub limit. Try attaching the smallest sample file you have and let's see if it works :) (or even just its first 1000 lines, assuming you still get the warning on this reduced file).
  2. The sortmerna version looks good.
  3. Can you send just the tail (say the last 100 lines) of the log file (debuglog.txt)? It's a very strange error, but hopefully we'll get to the bottom of it :)

thanks Amnon

wasade commented 6 years ago

The pytz issue is upstream of us in pandas so out of scope I believe. It's unclear if there was follow up on the warnings, so closing this issue for now. Please reopen if the warnings are still a problem.