qiime2 / q2-demux

BSD 3-Clause "New" or "Revised" License
0 stars 20 forks source link

BUG: Do not assume single end is all forward #118

Closed Oddant1 closed 4 years ago

Oddant1 commented 4 years ago

Not sure how to make test data for this because I don't know how to make single end reads that all look like reverse reads.

ref: https://forum.qiime2.org/t/error-while-visualization/15743

Oddant1 commented 4 years ago

There's some goofiness here that I haven't resolved yet hence the sister PR. All changes post https://github.com/qiime2/q2-demux/pull/118/commits/88c16dc5780a209e2529869d096d1e67d1e4aa4b including the changes in the sister pr are for the sole purpose of making the test work (which is obviously a bit goofy).

Oddant1 commented 4 years ago

The problem is basically this. Because the test is importing the data as CasavaOneEightSingleLanePerSampleDirFmt there is no manifest included with the data (correct me if I'm wrong, but I believe this format is necessary to test this particular behavior since it has no manifest file and with a manifest this error doesn't exist). The manifest is instead created here when the code first attempts to access it.

if isinstance(data.manifest, pd.DataFrame):
    manifest = data.manifest
else:
    manifest = data.manifest.view(pd.DataFrame)

This is where you'll notice the first little bit of weirdness. Because the manifest is created by that call, it is created as a DataFrame not a BoundFile causing that .view to fail because pd.DataFrame has no .view hence the check. Even with that call, the manifest is created without a forward column at all causing a KeyError later on when the code attempts to access that column. That is what is resolved by the sister PR.

I'm not sure how to resolve this. When I execute summarize through the command line on the same data, the manifest is created as a BoundFile and everything goes fine. How do I give the CasavaOneEightSingleLanePerSampleFirFmt a BoundFile manifest in the test? If that isn't a thing I can do, how else can I test this behavior? Thank you.

Oddant1 commented 4 years ago

@thermokarst can you take a look at this when you get a chance? I'm not convinced the way I'm implementing this test is good, but all my attempts to make things work better have been foiled thus far leading things to where they are now (see the above for context). I suggest pulling just this PR and running the demux tests and viewing the error then also pulling the sister PR and running the demux tests again (they should pass with both PRs). If the current solution is acceptable, great. If not, and I am missing something as I suspect I am, any input would be appreciated.

thermokarst commented 4 years ago

Yeah, this is still on my radar, hopefully we can get it in for 2020.8...

Oddant1 commented 4 years ago

I created a branch where I rolled back the changes I made to add the tests and am trying to start again, and I noticed another issue that I didn't before. When you visualize an artifact with only reverse reads the interactive quality plot for reverse only is properly rendered. But the frequency histogram isn't


image

EDIT: I have now pulled @ebolyen's commit and will see if I can fix the viz real quick off of that commit