UCSF-DSCOLAB / data_processing_pipelines

A repository to store the existing pipelines to process the various CoLabs datasets

cap nCount_ADT axis to 25000 in diagnostic plots #71

Open dtm2451 opened 5 months ago

dtm2451 commented 5 months ago

Simple update: Uses x/ylim to enforce [0, 25000] as the range for the "nCount_ADT" axes of diagnostic plots.

Current issue: I've found that most of my CITEseq libraries have a handful of outlier cells with values in the hundreds of thousands for this metric, whereas the bulk of cells fall into the 1,000-15,000 range. This common phenomenon shrinks the bulk of the data into a sliver at the lower end of the axis, which effectively renders the plots unusable for their intended purpose -- allowing the user to dial in a cutoff. The range of values where a cutoff might be drawn ends up having very little resolution.

The fix here: This PR adjusts the function used for scatterplot creation to enforce axis limits of 0 to 25,000 for "nCount_ADT"-axes.
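
To put rough numbers on the resolution problem, here is a small standalone sketch. It is not the pipeline's plotting code: the function names and the nCount_ADT values are made up for illustration, and only the 0-25,000 limits and the 1,000-15,000 bulk range come from the description above.

```python
# Illustrative sketch only: the helper names and the values below are
# hypothetical and are not part of the pipeline.
ADT_AXIS_CAP = 25_000  # upper axis limit named in this PR


def bulk_axis_fraction(bulk_low, bulk_high, axis_max, axis_min=0):
    """Fraction of the plotted axis length covered by [bulk_low, bulk_high]."""
    visible_low = max(bulk_low, axis_min)
    visible_high = min(bulk_high, axis_max)
    return max(visible_high - visible_low, 0) / (axis_max - axis_min)


def cells_beyond_cap(values, cap=ADT_AXIS_CAP):
    """Count cells whose value would fall outside a [0, cap] axis."""
    return sum(1 for v in values if v > cap)


# Made-up nCount_ADT values: mostly 1,000-15,000 plus two extreme outliers.
n_count_adt = [1_000, 2_500, 4_000, 7_500, 12_000, 15_000, 180_000, 300_000]

# Without a cap, the axis stretches to the largest outlier.
uncapped = bulk_axis_fraction(1_000, 15_000, axis_max=max(n_count_adt))
# With the cap, the axis stops at 25,000.
capped = bulk_axis_fraction(1_000, 15_000, axis_max=ADT_AXIS_CAP)

print(f"uncapped axis: bulk spans {uncapped:.1%} of the axis")
print(f"capped axis:   bulk spans {capped:.1%} of the axis")
print(f"cells beyond the cap: {cells_beyond_cap(n_count_adt)}")
```

With these made-up values, the 1,000-15,000 bulk covers about 4.7% of an uncapped axis but 56.0% of the capped one, at the cost of two outlier cells no longer being drawn.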

erflynn commented 5 months ago

this is great, feel free to merge!

dtm2451 commented 5 months ago

I'll merge pending a test on real data! (So far I've only run a version of the fix in a separate script and haven't yet tested it through the pipeline on this PR branch.)