New multi-sample toy data set

abremges commented 7 years ago

For the second mode of Opal (cross-sample comparisons to check how well tools capture e.g. changes in alpha diversity), we need to simulate a small and controlled data set, generate the gold standard, and run different profilers on it.

@fernandomeyer and I will think about it and discuss the experimental setup (probably involving @AlphaSquad, too, if CAMISIM can do the job).

alicemchardy commented 7 years ago

We can use the medium complexity data set and its results for this, since it has differential abundances. The high complexity time series could also be analysed that way.

Best, Alice


Helmholtz-Zentrum für Infektionsforschung GmbH | Inhoffenstraße 7 | 38124 Braunschweig | www.helmholtz-hzi.de

Chair of the Supervisory Board: MinDir’in Bärbel Brumme-Bothe, Federal Ministry of Education and Research | Deputy: MinDirig Rüdiger Eichel, Lower Saxony Ministry of Science and Culture | Management: Prof. Dr. Dirk Heinz; Silke Tannapfel | Limited liability company (GmbH) | Registered office: Braunschweig | Commercial register: Amtsgericht Braunschweig, HRB 477

abremges commented 7 years ago

But CAMI's MC and HC data sets have the same alpha diversity (same species occur with variable abundances). I thought one aspect was to assess how well tools detect varying community complexity!?

alicemchardy commented 7 years ago

Hmm, true. Though tools might still fail at that on these data sets, as abundances are sometimes very low.

Re: measures of complexity, I got some input from Ruben, who likes Shannon (for abundance, though David's examples made me even more worried), some tree-based index (Childs? Or Faith? Will check), and simply counting taxa for comparing abundances at a certain sampling depth and rank. Maybe rarefaction curves would be a good/better idea?

Best, Alice


sjanssen2 commented 7 years ago

I always look at "observed_otus" (= raw number of different species regardless of their abundance), "shannon", and "Faith Phylogenetic Diversity" (which requires a phylogenetic tree and assesses the covered branch lengths). I like the latter the most, but do we have a phylogeny for the organisms in the CAMI data? You might want to take a look at https://docs.qiime2.org/2017.10/plugins/available/diversity/alpha-rarefaction/ for rarefaction curves.
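To make the three measures concrete, here is a minimal sketch in plain Python. It is not QIIME 2's or OPAL's implementation: the function names, the toy tree (`parent`, `branch_length`), the sample counts, and the choice of log base 2 for Shannon are all assumptions made for illustration.

```python
import math


def observed_taxa(counts):
    """Number of taxa with non-zero abundance, ignoring how abundant they are."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts, base=2.0):
    """Shannon index H = -sum(p_i * log(p_i)); log base 2 is an assumption."""
    total = sum(counts.values())
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def faith_pd(counts, parent, branch_length):
    """Sum of branch lengths covered by the paths from observed taxa to the root."""
    covered = set()
    for taxon, c in counts.items():
        if c <= 0:
            continue
        node = taxon
        # Walk up until the root (no parent entry) or an already covered node.
        while node in parent and node not in covered:
            covered.add(node)
            node = parent[node]
    return sum(branch_length[n] for n in covered)


# Hypothetical toy tree: each node maps to its parent, and branch_length
# holds the length of the branch leading from the node to its parent.
parent = {"t1": "n1", "t2": "n1", "n1": "root", "t3": "root"}
branch_length = {"t1": 1.0, "t2": 1.0, "n1": 0.5, "t3": 2.0}

# Hypothetical read counts for two samples.
sample_a = {"t1": 5, "t2": 5, "t3": 0}
sample_b = {"t1": 8, "t2": 1, "t3": 1}

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(name, observed_taxa(sample), round(shannon(sample), 3),
          faith_pd(sample, parent, branch_length))
```

In this toy case sample B has more observed taxa and a larger Faith PD than sample A but a lower Shannon index, because its abundances are less even. That is one reason to report several measures side by side.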

alicemchardy commented 7 years ago

These are the same ones, perfect! We have the taxonomy for Faith.

Best, Alice


alicemchardy commented 7 years ago

Correction: for CAMI, we so far only have predictions for the pooled data sets, not for individual CAMI samples. So at the very least we have to rerun the individual tools on the individual samples.

fernandomeyer commented 6 years ago

Cross-sample comparisons have been done using CAMI I high and CAMI II mouse gut data. Alpha/beta diversities can now be measured with Shannon, rarefaction curves...
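For readers unfamiliar with rarefaction curves, the idea can be sketched as follows. This is an illustrative sketch only, not how OPAL or QIIME 2 computes them: the function name, its parameters, and the toy counts are assumptions. It also assumes integer read counts per taxon, so a relative-abundance profile would first have to be converted to counts.

```python
import random


def rarefaction_curve(counts, depths, n_iter=50, seed=0):
    """Mean number of distinct taxa seen when subsampling reads at each depth.

    counts: dict mapping taxon -> integer read count (hypothetical input format).
    depths: iterable of subsampling depths; depths above the total are skipped.
    """
    rng = random.Random(seed)
    # Expand the profile into one list entry per read.
    reads = [taxon for taxon, c in counts.items() for _ in range(c)]
    curve = {}
    for depth in depths:
        if depth > len(reads):
            continue  # cannot draw more reads than the sample contains
        total = 0
        for _ in range(n_iter):
            # Sampling without replacement, then counting distinct taxa.
            total += len(set(rng.sample(reads, depth)))
        curve[depth] = total / n_iter
    return curve


# Hypothetical sample with 100 reads spread over four taxa.
counts = {"t1": 50, "t2": 30, "t3": 15, "t4": 5}
curve = rarefaction_curve(counts, depths=[1, 10, 50, 100, 200])
for depth, richness in curve.items():
    print(depth, richness)
```

A curve that flattens well before the full depth suggests the sample was sequenced deeply enough to capture its richness; comparing samples at a shared depth avoids confounding richness with sequencing effort.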

CAMI-challenge / OPAL

New multi-sample toy data set #20