qiime2 / docs

https://docs.qiime2.org
BSD 3-Clause "New" or "Revised" License

parkinson's mouse tutorial should document `fragment-insertion filter-features` step after `fragment-insertion sepp` #440

Open gregcaporaso opened 4 years ago

gregcaporaso commented 4 years ago

Improvement Description

`qiime fragment-insertion sepp` is applied in the Parkinson's mouse tutorial, but it isn't followed by a `qiime fragment-insertion filter-features` step to remove from the feature table any ASVs that weren't inserted into the tree. We should document `qiime fragment-insertion filter-features` at this point in the tutorial, since it is often necessary after running `qiime fragment-insertion sepp`. This content could be derived from the tutorial provided here.

Current Behavior

It seems that all of the ASVs in this tutorial are successfully inserted into the tree, so there are no failures in downstream steps.

Proposed Behavior

We should document that this step is often necessary to ensure that all of the features in the feature table are also present in the tree. We can note that the filter isn't necessary in this tutorial, but that it often will be. This could go in a box (probably where we explain how to determine whether it's necessary), or we could just include the filtering command even though it doesn't change anything here. Including the command might be safer: if a change in an upstream step such as denoising ever results in one or more ASVs not being placed, the build of this tutorial would otherwise fail.
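To make the intent of the filtering step concrete, here is a minimal sketch in plain Python of the split it is meant to perform: features with a tip in the tree are kept, and the rest are set aside. This is not the plugin's implementation. The function name `split_table_by_tree`, the dict-based table, and the toy feature IDs are all hypothetical stand-ins chosen for illustration.

```python
def split_table_by_tree(table, tree_tips):
    """Split a feature table by whether each feature has a tip in the tree.

    table: dict mapping feature ID -> list of per-sample counts
           (a hypothetical stand-in for a real feature table).
    tree_tips: iterable of tip names found in the insertion tree.
    Returns (kept, removed), two dicts with the same shape as `table`.
    """
    tips = set(tree_tips)
    kept = {fid: counts for fid, counts in table.items() if fid in tips}
    removed = {fid: counts for fid, counts in table.items() if fid not in tips}
    return kept, removed


# Toy data (assumed): ASV3 was not placed, so it has no tip in the tree.
table = {"ASV1": [10, 0, 3], "ASV2": [0, 5, 1], "ASV3": [2, 2, 0]}
tree_tips = ["ASV1", "ASV2", "ref_0001", "ref_0002"]

filtered_table, removed_table = split_table_by_tree(table, tree_tips)
print("kept:", sorted(filtered_table))
print("removed:", sorted(removed_table))
```

If `removed_table` is empty, as appears to be the case for this tutorial's data, the filter is a no-op and the downstream steps see the same table.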

mestaki commented 4 years ago

In my own workflows I tend to always do the filtering, even though I've never yet had a case where features were not inserted into the tree. For the reasons Greg mentioned, I think this is a better approach than a box. That being said, I was thinking of opening a new issue to propose a pipeline that runs the filtering right after fragment insertion in one step. Or maybe an easier approach would be for fragment insertion to simply raise a warning when >0 features fail to be placed in the tree, advising the user to perform the filtering step. If this makes sense, I'll go ahead and create an issue?
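The warning idea could look roughly like the sketch below. It only illustrates the check itself (compare the feature IDs against the tree tips and warn when any are missing). The helper name `warn_on_unplaced` and the toy IDs are hypothetical, and nothing here reflects how the plugin is actually structured.

```python
import warnings


def warn_on_unplaced(feature_ids, tree_tips):
    """Return the feature IDs missing from the tree, warning if there are any."""
    missing = sorted(set(feature_ids) - set(tree_tips))
    if missing:
        warnings.warn(
            f"{len(missing)} feature(s) were not inserted into the tree; "
            "consider running `qiime fragment-insertion filter-features` "
            "before downstream steps that use the tree.",
            UserWarning,
        )
    return missing


# Toy data (assumed): ASV3 has no corresponding tip in the tree.
unplaced = warn_on_unplaced(
    ["ASV1", "ASV2", "ASV3"],
    ["ASV1", "ASV2", "ref_0001"],
)
print(unplaced)
```

Returning the missing IDs as well as warning means a caller could either ignore them, report them, or pass them straight to a filtering step.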

thermokarst commented 4 years ago

> maybe an easier approach would be simply for fragment insertion to raise a warning when >0 features failed to be placed in the tree and this would advise the user to perform the filtering step. If this makes sense I'll go ahead and create an issue?

Maybe we should return the filtered FeatureData[Sequence] from the sepp command. Then it can be used pretty easily for downstream filtering (and it is also easily summarized, etc.).