hui2000ji / scETM

A generative topic model that facilitates integrative analysis of large-scale single-cell RNA sequencing data.
BSD 3-Clause "New" or "Revised" License

pathDIP data #5

Closed: hxy265 closed this issue 1 year ago

hxy265 commented 1 year ago

Hi, really nice package! I am trying to run this with pathDIP data, but when I use the download link https://ophid.utoronto.ca/pathDIP/Download.jsp I get files such as CURATE_V4.txt. This appears to be different from pathway_gene_matv4.csv in the vignette. How do I generate this CSV from the downloaded files? Thanks!

yifnzhao commented 1 year ago

Hello! Thanks for your interest in scETM. The pathway-gene matrix simply needs to be a binary matrix -- entry (i,j) equals 1 if pathway i includes gene j according to a user-defined pathway database, such as pathDIP; entry (i,j) equals 0 otherwise. Please don't hesitate to follow up if anything's unclear!

Yifan
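
For readers following along, here is a minimal sketch of one way to build such a binary matrix with pandas. It is not the authors' script. It assumes the downloaded file can be read as a long-format table with one pathway-gene pair per row; the column names `pathway` and `gene`, the helper `build_pathway_gene_matrix`, and the toy pathway and gene names are all hypothetical, so check the real header of your pathDIP download before adapting it.

```python
import io

import pandas as pd


def build_pathway_gene_matrix(long_table, pathway_col, gene_col):
    """Return a binary pathways x genes DataFrame.

    Entry (i, j) is 1 if pathway i includes gene j, and 0 otherwise.
    """
    # Keep only the two relevant columns and drop missing or repeated pairs.
    pairs = long_table[[pathway_col, gene_col]].dropna().drop_duplicates()
    # Cross-tabulate pathway against gene, then binarise the counts.
    counts = pd.crosstab(pairs[pathway_col], pairs[gene_col])
    return (counts > 0).astype(int)


# Toy stand-in for a downloaded file (ASSUMED layout: tab-separated,
# one pathway-gene pair per row; names below are made up).
toy_tsv = (
    "pathway\tgene\n"
    "Pathway A\tGENE1\n"
    "Pathway A\tGENE2\n"
    "Pathway B\tGENE2\n"
    "Pathway B\tGENE3\n"
    "Pathway A\tGENE1\n"  # duplicate pair, should still give 1
)
toy = pd.read_csv(io.StringIO(toy_tsv), sep="\t")

matrix = build_pathway_gene_matrix(toy, "pathway", "gene")
print(matrix)
```

For a real download, replace the toy table with `pd.read_csv(<your file>, sep="\t")`, adjust the two column names, and save the result with `matrix.to_csv(...)`. The gene identifiers presumably need to match those used in your expression data, so some filtering or reordering of the columns may be needed afterwards.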

hxy265 commented 1 year ago

Thanks, is there any chance you can provide this as a reference file? It would make it a lot easier!

yifnzhao commented 1 year ago

Given that data access to pathDIP is regulated by Jurisica Lab, currently I don't think we have plans (or the right permission) to release this dataset in our repository. As I described above, once you download the data, it is very simple to convert it to a pathway-gene binary matrix.

Yifan