JanLeipzig / ViennaCCCdb

"COMPLEX:" in gene symbol columns? #5

Open dnjst opened 1 year ago

dnjst commented 1 year ago

Should we have the "COMPLEX:" prefix in the gene symbol columns as well as in the UniProt columns?

squidpy and chinpy expect this prefix, so I wrote this code to update the list:

import pandas as pd

# the file is read as tab-separated despite the .csv extension
intercell_vienna = pd.read_csv("https://raw.githubusercontent.com/JanLeipzig/ViennaCCCdb/main/ViennaCCCdb.csv", sep="\t")

# add the "COMPLEX:" prefix to complexes (symbols containing "_") that do not have it yet
intercell_vienna["source_genesymbol"] = ["COMPLEX:" + x if "_" in x and "COMPLEX:" not in x else x for x in intercell_vienna["source_genesymbol"]]
intercell_vienna["target_genesymbol"] = ["COMPLEX:" + x if "_" in x and "COMPLEX:" not in x else x for x in intercell_vienna["target_genesymbol"]]

But if this breaks other tools (I need to double-check them), maybe it is best to leave it out? I vaguely remember the cell2cell vectors needing the names without this string, and maybe LIANA as well.
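
Since tools may differ on this, one option would be to keep the table without the prefix and add it only when loading for a tool that expects it. Below is a minimal sketch of that idea, not tested against ViennaCCCdb itself. The helper names `add_complex_prefix` and `strip_complex_prefix` are hypothetical, the demo rows are illustrative rather than taken from the database, and it assumes (as the code above does) that a complex is any symbol containing "_". Unlike the list comprehension above, it checks for the prefix only at the start of the symbol and leaves missing values untouched.

```python
import pandas as pd

PREFIX = "COMPLEX:"
SYMBOL_COLS = ["source_genesymbol", "target_genesymbol"]


def add_complex_prefix(df, cols=SYMBOL_COLS):
    """Return a copy where symbols containing "_" carry the prefix (idempotent)."""
    out = df.copy()
    for col in cols:
        s = out[col]
        # na=False treats missing symbols as "not a complex" instead of raising
        is_complex = s.str.contains("_", regex=False, na=False)
        has_prefix = s.str.startswith(PREFIX, na=False)
        mask = is_complex & ~has_prefix
        out.loc[mask, col] = PREFIX + s[mask]
    return out


def strip_complex_prefix(df, cols=SYMBOL_COLS):
    """Return a copy with a leading "COMPLEX:" removed from the symbol columns."""
    out = df.copy()
    for col in cols:
        # anchored at the start so only a leading prefix is removed
        out[col] = out[col].str.replace(r"^COMPLEX:", "", regex=True)
    return out


# illustrative rows only, not real ViennaCCCdb content
demo = pd.DataFrame({
    "source_genesymbol": ["ITGA1_ITGB1", "TGFB1", "COMPLEX:A_B"],
    "target_genesymbol": ["EGFR", "COMPLEX:IL2RA_IL2RB", None],
})

with_prefix = add_complex_prefix(demo)             # variant for tools that expect the prefix
without_prefix = strip_complex_prefix(with_prefix)  # variant for tools that do not
```

Both helpers return copies, so the same loaded table can feed either kind of tool without being modified in place.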

JanLeipzig commented 1 year ago

I will check whether LIANA can deal with it. If yes, we could just add it; otherwise we will have to think about it.

JanLeipzig commented 1 year ago

Apparently, this causes problems in lianapy. Remove COMPLEX from gene names again!