biocore / oecophylla

shotgun pipeline
MIT License

Added Centrifuge rule #100

Closed (qiyunzhu closed 7 years ago)

qiyunzhu commented 7 years ago

A few notes:

1) It is still unclear whether re-estimating per-rank relative abundance with Bracken on Centrifuge output is recommended. I chose to do so for now, based on a conversation with Gabe and on my own understanding.

2) The organization of config.yaml and the parser could be made more generic rather than specific. I didn't make any changes to them in this PR, but we can discuss it.

tanaes commented 7 years ago

Awesome, thanks! Will take a look.

qiyunzhu commented 7 years ago

@sjanssen2 @mortonjt Wanna take a look? Thanks!

qiyunzhu commented 7 years ago

@mortonjt I guess "wet run" means running it locally with actual profiling operation enabled. The answer is yes. All my PRs were tested and worked locally.

tanaes commented 7 years ago

Have you seen https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.replace.html? I like to use it for that sort of operation.

On Mon, Oct 2, 2017 at 12:57 PM Qiyun Zhu notifications@github.com wrote:

@qiyunzhu commented on this pull request.

In oecophylla/taxonomy/taxonomy.rule https://github.com/biocore/oecophylla/pull/100#discussion_r142195408:

@@ -200,19 +201,124 @@ rule taxonomy_kraken_combine_profiles:
        "benchmarks/taxonomy/taxonomy_kraken_combine_profiles.txt"
    run:
        pandas2biom(output[0], combine_profiles(zip(samples, input)))
        if params['levels']:
            for level in params['levels'].split(','):
                redists = ['%s.redist.%s.txt' % (x[:-12], level)
                           for x in input]
                pandas2biom('%s_redist.%s.biom' % (output[0][:-13], level),
                            combine_bracken(zip(samples, redists)))

        for level in params['levels'].split(','):
            redists = ['%s.redist.%s.txt' % (x[:-12], level)
                       for x in input]
            pandas2biom('%s_redist.%s.biom' % (output[0][:-13], level),

This [:-13] removes _profile.biom from xxx_profile.biom. Is there a better way to do this?


qiyunzhu commented 7 years ago

@tanaes Thanks! I know this pandas trick, but I was working with a plain Python string, not a pandas Series.
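
For a plain Python string, one alternative to a hard-coded slice such as [:-13] is to name the suffix and strip it only when it is actually present. The following is a minimal sketch, not the code adopted in this PR: the helper name strip_suffix and the example file name are hypothetical, and only the _profile.biom suffix and the '%s_redist.%s.biom' pattern come from the discussion above.

```python
def strip_suffix(name, suffix):
    """Return name without a trailing suffix; return it unchanged otherwise."""
    # Guard against an empty suffix: name[:-0] would yield an empty string.
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


# Hypothetical output path, standing in for output[0] in the rule.
profile_biom = "combined_profile.biom"
level = "S"

# Equivalent to output[0][:-13], but the intent is visible and a
# non-matching name is left untouched instead of silently truncated.
base = strip_suffix(profile_biom, "_profile.biom")
redist_biom = "%s_redist.%s.biom" % (base, level)
print(redist_biom)
```

On Python 3.9 or later, the built-in str.removesuffix does the same thing without a helper.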

qiyunzhu commented 7 years ago

Hi @mortonjt, take a look at the new lines #311-316. Do you like this style? (Sorry, I just saw your reply.)

qiyunzhu commented 7 years ago

@tanaes @mortonjt Feel free to click "merge" :)