sklages closed this issue 3 years ago
Could you provide a screen shot of the plot you are looking for? Is it the per lane box plot at the bottom center of the Analysis tab?
Checking for overclustered flowcells on NovaSeq:
When simply plotting imaging_table output (x='% Occupied', y='% Pass Filter') I get a similar, but not identical image (both series sorted):
This has not yet been separated by lane ...
Ah, I see. Thanks for the screen shot, I definitely misunderstood what you were asking.
The InterOp library provides most of the plots in SAV, except the ones from the imaging table. Those are built using some legacy plotting library outside of InterOp.
We have not developed any Python code to do what you are trying to do. So, getting them to match exactly is probably not worth the effort.
The biggest difference I see between the two plots is that the SAV code is filtering the data by the Lane column and plotting a series per Lane. Another change would be to reduce the size of the marker you are using.
I personally like plotly express. If you work from a pandas DataFrame, then you can do it in one line:
plotly.express.scatter(df, '% Occupied', '% Pass Filter', color='Lane')
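For reference, here is a slightly fuller sketch of the same idea: one series per lane, smaller markers, and optional random thinning if the plot is too dense. It assumes the imaging_table output has already been loaded into a pandas DataFrame with the columns Lane, % Occupied and % Pass Filter. The helper names (thin_per_lane, lane_scatter), the default values and the demo numbers are made up for illustration; this is not what SAV does internally.

```python
import pandas as pd


def thin_per_lane(df, max_points=2000, seed=0):
    """Randomly keep at most `max_points` rows for each lane (hypothetical helper)."""
    parts = []
    for _, lane_df in df.groupby("Lane"):
        n = min(len(lane_df), max_points)
        parts.append(lane_df.sample(n=n, random_state=seed))
    # Restore the original row order after sampling.
    return pd.concat(parts).sort_index()


def lane_scatter(df, marker_size=3):
    """Scatter of % Occupied vs % Pass Filter with one colored series per lane."""
    import plotly.express as px  # imported lazily; only needed for plotting

    # Cast Lane to string so plotly assigns discrete colors, not a color scale.
    plot_df = df.assign(Lane=df["Lane"].astype(str))
    fig = px.scatter(plot_df, x="% Occupied", y="% Pass Filter", color="Lane")
    fig.update_traces(marker={"size": marker_size})
    return fig


# Made-up numbers for illustration only, not real run data.
demo = pd.DataFrame({
    "Lane": [1, 1, 1, 2, 2, 2],
    "% Occupied": [88.1, 90.4, 92.7, 95.2, 96.0, 97.3],
    "% Pass Filter": [78.5, 79.9, 80.2, 74.1, 72.8, 70.6],
})
thinned = thin_per_lane(demo, max_points=2)
```

Calling lane_scatter(thinned).show() should then open the figure. Random thinning is only one way to reduce the number of points; whether it resembles what SAV shows is not something this sketch can confirm.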
Thanks for your advice!
Yes, I still need to separate per lane. I also need to reduce the number of datapoints, I guess ...
For custom plots outside standard SAV it is probably best to stick with imaging_table output ..
Plotly Express looks good .. I will have a look. Thanks for this :-)
I wanted to parse imaging_table output in Python to plot % Occupied vs % PF per lane to get an image similar to the one in SAV (for NovaSeq data).
Well, the plots do not look the same, so I assume you are not simply taking all values of both columns and plotting them as a scatter plot? It looks like there are fewer data points in SAV...
Alternatively, how can I accomplish this using the Python interop bindings? The docs are hard to read (for me as a Python beginner) ..
thank you.