biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License
58 stars 25 forks source link

Improved documentation and Q2 summary visualizations; centralizing parameters #80

Closed fedarko closed 5 years ago

fedarko commented 5 years ago

An attempt to summarize new stuff:

OK, I think that's the bulk of it. Please let me know if you'd like any clarification or for me to change anything. In sum, I think these changes will make using Songbird a lot more pleasant (and understandable) for many people. Thanks!

fedarko commented 5 years ago

Also, it'd probably be a good idea for multiple people to review this since this is a lot of changes, and it's important that they're correct. @lisa55asil if you'll be around tomorrow, would you mind sitting down and going over this with me? A lot of the documentation changes here involved your FAQs, so you'd probably be the most qualified person to comment on that :)

mortonjt commented 5 years ago

@fedarko , this is awesome! Will take a close look at it today / tomorrow.

fedarko commented 5 years ago

Thanks! @lisa55asil is looking over it right now, so I think we can merge this in once we incorporate her feedback.

One quick question—there's this part of the FAQ:

Q. What's up with the --training-column argument?

A. That is used for cross-validation if you have a specific reproducibility question that you are interested in answering. If this is specified, only samples labeled "Train" under this column will be used for building the model and samples labeled "Test" will be used for cross validation. In other words the model will attempt to predict the microbe abundances for the "Test" samples. The resulting prediction accuracy is used to evaluate the generalizability of the model in order to determine if the model is overfitting or not. If this argument is not specified, then 10 random samples will be chosen for the test dataset. If you want to specify more random samples to allocate for cross-validation, the --num-random-test-examples argument can be specified.

This says "10 random samples", but the default for --num-random-test-examples is 5. Should we change that?

fedarko commented 5 years ago

Ok, Lisa's changes have been integrated in. We've both made some changes (I fixed the --num-random-test-examples thing, as well as tidied up the discussions of # of iterations). Also tried to clean up the "baseline model" description in the summarize-paired help here.

@mortonjt, pending your final approval I think this should be good to merge in.

fedarko commented 5 years ago

...made a few last-minute commits showing off using qiime metadata tabulate on a differentials file, but i think that's it from me for right now :)

lisa55asil commented 5 years ago

That's awesome! If its easy and you already have it a link to the output of tabulating the differentials.qza from the Red Sea data would be helpful I think.

On Mon, Sep 23, 2019 at 3:31 PM Marcus Fedarko notifications@github.com wrote:

...made a few last-minute commits showing off using qiime metadata tabulate on a differentials file, but i think that's it from me for right now :)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/biocore/songbird/pull/80?email_source=notifications&email_token=AE6EV4WU2TFHC6UYXCRBDWDQLE7SLA5CNFSM4IYSYWZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7MO2TA#issuecomment-534310220, or mute the thread https://github.com/notifications/unsubscribe-auth/AE6EV4UX7Y5V7ROPYBL4BI3QLE7SLANCNFSM4IYSYWZA .

fedarko commented 5 years ago

Ok! I'm not sure adding a link to a QZV is the best idea, since I guess that'd have to be periodically updated if we also update Songbird... but I added a screenshot of the QZV output. I think that's a good advertisement for that functionality :) (here's a link to the screenshot)

mortonjt commented 5 years ago

Noting qiime metadata tabulate is a great idea!

mortonjt commented 5 years ago

@fedarko are you finished with edits?

fedarko commented 5 years ago

Going to make one more pass over everything today. I'll try to have it done by 5pm EST.

Thanks! Marcus

On Tue, Sep 24, 2019, 9:00 AM Jamie Morton notifications@github.com wrote:

@fedarko https://github.com/fedarko are you finished with edits?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/biocore/songbird/pull/80?email_source=notifications&email_token=AA736PYLQZ5MMMHB364XZY3QLI2TPA5CNFSM4IYSYWZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7O4EAA#issuecomment-534626816, or mute the thread https://github.com/notifications/unsubscribe-auth/AA736P4Y5QAKAM273AMFUITQLI2TPANCNFSM4IYSYWZA .

fedarko commented 5 years ago

@mortonjt I'm done! Added a small subsection clearly describing setting the reference for categorical formulas, since this has confused a lot of people (me included) in the past.

Should be ready to merge in—thanks @lisa55asil + @mortonjt for your help with this :)

lisa55asil commented 5 years ago

This is looking dope Marcus! I really like the clarification about the reference for categorical formulas. I added a short sentence that I think helps make this description more tangible.

On Tue, Sep 24, 2019 at 2:01 PM Marcus Fedarko notifications@github.com wrote:

@mortonjt https://github.com/mortonjt I'm done! Added a small subsection clearly describing setting the reference for categorical formulas, since this has confused a lot of people (me included) in the past.

Should be ready to merge in—thanks @lisa55asil https://github.com/lisa55asil + @mortonjt https://github.com/mortonjt for your help with this :)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/biocore/songbird/pull/80?email_source=notifications&email_token=AE6EV4XZ6EMQNRAACDYSSLDQLJ527A5CNFSM4IYSYWZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD7PZK6A#issuecomment-534746488, or mute the thread https://github.com/notifications/unsubscribe-auth/AE6EV4WIQYV5KM4JVLNRLYLQLJ527ANCNFSM4IYSYWZA .

fedarko commented 5 years ago

@mortonjt I'm done with editing, so provided you're cool with it I think this is ready to go. Thanks!

mortonjt commented 5 years ago

Awesome. Thanks @fedarko !