Open yeroslaviz opened 5 months ago
Hi @yeroslaviz
I can help you load your results in a python notebook using ScreenPro2 and then visualize volcano plots for each comparisons (i.e., rho scores). ScreenProcessing can be limited for exploratory data analysis. But you can still rely on the result tables for down stream analysis.
Best, Abe
This would be great,
I have installed screenPro2 on my server in a conda
environment (python 3.12).
What do you need to know?
thanks
you have these three files, right?
{experimentName}_librarytable.txt
=> library table
{experimentName}_mergedcountstable.txt
=> merged counts table
{experimentName}_phenotypetable.txt
=> phenotype table
Also, can you tell which version of ScreenPro2 you have installed? I would recommend newer versions. You can update it in your conda
env
pip install --upgrade ScreenPro2
This should be a good place to start:
You can use loadScreenProcessingData
function to load the result tables from your ScreenProcessing run. Then, you need some custom code to make relevant object. Here is an example code (you probably need to play around with codes):
import pandas as pd
import anndata as ad
from screenpro.load import loadScreenProcessingData
from screenpro.assays import PooledScreens
def mergeColumnIndex(df_in):
df = df_in.copy()
df.columns = df.columns.map(' '.join)
return df
def convertScreenProcessing_to_ScreenPro(data_in):
if type(data_in) is str:
data = loadScreenProcessingData(data_in)
else:
data = data_in.copy()
adata = ad.AnnData(
var = data['counts'].columns.to_frame().rename(columns={0:'treatment',1:'replicate'}).reset_index(drop=True).rename(index=lambda s: f'sample_{str(s)}'),
obs=data['library']
)
adata.X = data['counts'].loc[adata.obs.index,:].to_numpy()
screen = PooledScreens(adata)
screen.phenotypes.update({
'phenotypes': mergeColumnIndex(data['phenotypes']),
"transcript_scores": data['transcript scores'],
"gene_scores": data['gene scores']
})
# adata.uns['transcript scores'] = mergeColumnIndex(data['transcript scores']).reset_index().set_index('gene')
# adata.uns['gene scores'] = mergeColumnIndex(data['gene scores'])
# adata.transcript_scores = data['transcript scores']
# adata.gene_scores = data['gene scores']
return screen
Once you make this work, you can use plotting
module.
you have these three files, right?
{experimentName}_librarytable.txt
=> library table{experimentName}_mergedcountstable.txt
=> merged counts table{experimentName}_phenotypetable.txt
=> phenotype tableAlso, can you tell which version of ScreenPro2 you have installed? I would recommend newer versions. You can update it in your
conda
envpip install --upgrade ScreenPro2
I'm using the version 0.3.1
Great, could you load ScreenProcessing result tables similar to https://github.com/mhorlbeck/ScreenProcessing/issues/34#issuecomment-2101413920?
You can also directly use ScreenPro2, that might be even simpler TBH!
Not sure I understand how to load the tables.
Not sure I understand how to load the tables.
Sorry about the confusion. The code I shared is an example that I used personally to load ScreenProcessing results for a project and then I could use ScreenPro2 to draw volcano plots.
Here is the source code for loadScreenProcessingData
. It's simply loading tables into python as dataframes. Let me know if it makes more sense now.
Once you load the tables and annotate columns accordingly, you can use plot_volcano
function. e.g.
import matplotlib.pyplot as plt
import screenpro as scp
# Create subplots and specify the size
fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(4, 1.8))
scp.pl.plot_volcano(ax1, gamma,threshold=5,xlim_l=-2, xlim_r=1,ctrl_label='negative_control')
ax1.grid(False)
ax1.get_legend().remove()
ax1.set_title('gamma')
scp.pl.plot_volcano(ax2, tau,threshold=5,xlim_l=-2, xlim_r=2,ctrl_label='negative_control')
scp.pl.label_sensitivity_hit(ax2,tau,'GENENAME',threshold=5,size=5,size_txt=6,t_x=-.6,t_y=.2,ctrl_label='negative_control')
ax2.grid(False)
ax2.get_legend().remove()
ax2.set_title('tau')
scp.pl.plot_volcano(ax3, rho,threshold=5,xlim_l=-20, xlim_r=20,ctrl_label='negative_control')
# label_resistance_hit(ax2,rho,'TTC1',size=5,size_txt=6,t_x=.1)
# label_sensitivity_hit(ax2,rho,'CLHC1',size=5,size_txt=6,t_y=.1)
ax3.grid(False)
ax3.get_legend().remove()
ax3.set_title('rho')
for ax in [ax1,ax2,ax3]:
for item in ([ax.title, ax.xaxis.label, ax.yaxis.label] + ax.get_xticklabels() + ax.get_yticklabels()):
item.set_fontsize(6)
plt.tight_layout()
# Show the plot
plt.show()
@yeroslaviz if run into any issues or you need more help, feel free to email me at abeaATarcinstituteDOTorg
in my data set we are doing some comparisons of multiple time points. When writing the needed comparisons in the config file we have multiple rho comparisons we need to do.
After the run is done we get one volcano plot.
How do we need to understand the results of this plot? Does it belong to only one of the
rho
comparisons or it is somehow a combination of all of them?I know it is possible to run multiple
rho
comparisons within one screenProcessing run. But does it make sense in term of the results I get?The two parts of the config file are attached below.
I would appreciate it, if you let me know.
thanks
Assa