biocore / qiime

Official QIIME 1 software repository. QIIME 2 (https://qiime2.org) has succeeded QIIME 1 as of January 2018.
GNU General Public License v2.0

set seed for rarefaction #1679

Open wdwvt1 opened 10 years ago

wdwvt1 commented 10 years ago

There was a request for a seed on single rarefaction to make sure that rarefaction was repeatable.

This would be easy. I envision the default behavior being to print the seed that was used when no seed is specified; if a seed is specified, it would be set at that same point in the code.

If this sounds okay, go ahead and assign to me.
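
A minimal sketch of the behavior proposed above, using only Python's standard library. The names `resolve_seed` and `rarefy` are hypothetical and this is not QIIME's actual rarefaction code; it only illustrates "print the seed when none is given, otherwise use the one supplied".

```python
import random
import sys


def resolve_seed(seed=None):
    """Return the seed to use, drawing and reporting one if none was given."""
    if seed is None:
        # No seed supplied: draw one from the OS entropy source and report it
        # on stderr so the run can be repeated later.
        seed = random.SystemRandom().randrange(2 ** 32)
        print("rarefaction seed: %d" % seed, file=sys.stderr)
    return seed


def rarefy(counts, depth, seed=None):
    """Subsample `depth` observations without replacement from `counts`."""
    rng = random.Random(resolve_seed(seed))
    # Expand the count vector into one entry per observed sequence.
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the total count of the sample")
    rarefied = [0] * len(counts)
    for otu in rng.sample(pool, depth):
        rarefied[otu] += 1
    return rarefied
```

Calling `rarefy(counts, depth)` with no seed reports the seed it drew; passing that value back as `seed=` should reproduce the same rarefied vector.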

ElDeveloper commented 10 years ago

YES!! I love this idea!

antgonza commented 10 years ago

Do you think this will lead to abuse of rarefaction, with everyone seeding with 1234?

ElDeveloper commented 10 years ago

@antgonza, why would that be a problem?

antgonza commented 10 years ago

That is not random anymore and everyone will use the same seed for everything ...

ElDeveloper commented 10 years ago

I see, though I don't see why that would be bad.

antgonza commented 10 years ago

The issue is abuse: imagine a situation where a given result is due to rarefaction and you can force it. I think it is worth adding, but I wonder if we can prevent abuse.

wdwvt1 commented 10 years ago

hmm, this is a stats problem out of my league. my feeling is that there is a reasonable risk that people will seed at 0 or something common. however, i am not sure a priori why that's a problem.

for instance: different otu vectors (from different otu tables with different numbers/abundances of bugs) will cause the rarefaction to be different even if seeded the same. unless the seed is in a bad section of the mersenne twister, e.g. a section that produces numbers that fail some sort of 'randomness' test, i don't know why everyone using the same seed would be bad. if the numbers are still pretty 'random' it seems okay.
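
To make the point above concrete, here is a small sketch with made-up OTU counts. The `rarefy` helper is hypothetical (standard-library subsampling without replacement, not QIIME's implementation); Python's `random` module happens to use the Mersenne Twister as well. Both samples are rarefied with the same common seed, yet which OTUs get drawn still depends on each sample's own counts.

```python
import random


def rarefy(counts, depth, seed):
    """Subsample `depth` observations without replacement (illustrative helper)."""
    rng = random.Random(seed)
    # One pool entry per observed sequence, labelled by OTU index.
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    rarefied = [0] * len(counts)
    for otu in rng.sample(pool, depth):
        rarefied[otu] += 1
    return rarefied


SEED = 0  # the kind of "common" seed discussed in this thread

sample_a = [50, 30, 20]  # made-up OTU counts
sample_b = [5, 5, 90]    # same total, very different composition

rarefied_a = rarefy(sample_a, 40, SEED)
rarefied_b = rarefy(sample_b, 40, SEED)

print(rarefied_a)
print(rarefied_b)
```

The seed fixes the stream of random numbers, not the outcome: the rarefied vectors differ because the underlying counts differ, while rerunning either sample with the same seed repeats its result.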

wasade commented 10 years ago

Users might assume that multiple calls to "multiple_rarefactions.py" on the same data would result in different output.

I like the idea of being able to optionally specify the seed, but not force a particular one or require it.
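
One way this could look, as a rough sketch only: the function name `multiple_rarefactions` and its parameters are hypothetical and do not reflect the actual script's code or options. With `seed=None` the generator is seeded from the operating system, so repeated calls keep giving different output; passing a seed makes the whole series of replicates repeatable.

```python
import random


def multiple_rarefactions(counts, depth, reps, seed=None):
    """Return `reps` rarefied copies of `counts`, each with `depth` observations."""
    # seed=None: Python seeds the generator from OS entropy (or the clock),
    # so unseeded calls are expected to differ from one another.
    rng = random.Random(seed)
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    results = []
    for _ in range(reps):
        rarefied = [0] * len(counts)
        # All replicates draw from one generator, so a single seed
        # determines the entire series without making replicates identical
        # by construction.
        for otu in rng.sample(pool, depth):
            rarefied[otu] += 1
        results.append(rarefied)
    return results
```

Because the seed is optional and defaults to unset, users who never pass it see the same behavior as before.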

ElDeveloper commented 10 years ago

I don't think we could prevent abuse. So yeah, :+1: to this feature.
