bm2-lab / scMVP

Data Availability #9

Open marianogabitto opened 2 years ago

marianogabitto commented 2 years ago

Hi, is it possible to get the data from a source other than Baidu Cloud? It is inaccessible to me.

Thanks

adamtongji commented 2 years ago

The sciCAR demo has been uploaded to my Dropbox folder. Could you check whether it is accessible?

https://www.dropbox.com/sh/p0wyyeuutzw9je9/AAAD362HYPXDGrjgltXpYmHza?dl=0

The input format is the same across all demo datasets, so you can test your code with the smallest one, the sciCAR cell line dataset.

marianogabitto commented 2 years ago

Hi, I was able to download it. Would it be possible to also get the two additional toy datasets that you use at the beginning of the paper? The SNARE-seq one and the paired one?

Thanks a lot !

marianogabitto commented 2 years ago

One more thing, do you have the raw data? The files you shared are normalized versions of it. Thanks!

adamtongji commented 2 years ago

Hi, the raw FASTQ files and count matrices for these datasets can be downloaded using the GEO accession IDs given in the original data papers.

scMVP expects a scRNA matrix restricted to the top differentially expressed genes (DEGs) and a TF-IDF-normalized or binary scATAC matrix as input. Including more scATAC peaks and using normalized scATAC input would both improve the latent embedding and imputation performance of scMVP.
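For the binary scATAC option, here is a minimal sketch of turning a raw peak count matrix into a 0/1 accessibility matrix. It assumes a dense NumPy array with cells as rows and peaks as columns; the function name `binarize_counts` is hypothetical and not part of scMVP.

```python
import numpy as np


def binarize_counts(counts):
    """Return a 0/1 matrix: 1 where a peak has at least one read in a cell.

    Assumption: `counts` is a dense cells x peaks array of raw counts.
    """
    counts = np.asarray(counts)
    return (counts > 0).astype(np.float32)


# Toy example: 2 cells x 3 peaks.
raw = np.array([[0, 2, 5],
                [1, 0, 0]])
binary = binarize_counts(raw)
print(binary)
```

For real datasets the matrix is usually sparse, so the same thresholding would be applied to the stored nonzero values instead of a dense array.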

The datasets exceed the limit of my Dropbox account, so I have uploaded the other two cell line datasets to Google Drive at the following link: https://drive.google.com/drive/folders/18ymTLyMb_wD20O4Z2qkOXBQt5yoDTvea?usp=sharing

Citugulia40 commented 1 year ago

Hi, I would like to ask how you generated the "sciCAR_cell_annot.txt" file. I have the barcodes, features and count matrix for both scRNA and scATAC. How can I get the annotation file?

Thanks

adamtongji commented 1 year ago

You can download the "sciCAR_cell_annot.txt" file directly from the demo dataset folder on the Baidu cloud disk or from the Dropbox folder (https://www.dropbox.com/sh/p0wyyeuutzw9je9/AAAD362HYPXDGrjgltXpYmHza?dl=0).

EddieBio commented 8 months ago

Hi,

I found that there is no TF-IDF code in your repository. Should we do this preprocessing ourselves in advance?

adamtongji commented 8 months ago

The scATAC profiles in the demo datasets have already been TF-IDF normalized using Seurat.

We compared performance using raw binary scATAC counts against TF-IDF-transformed scATAC profiles and found consistently (slightly) higher accuracy with the TF-IDF-transformed profiles. If you apply our tool to your own raw-count scATAC data, we suggest performing TF-IDF on the scATAC data in advance.
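As a rough illustration of that preprocessing step, here is a minimal sketch of one common log-scaled TF-IDF variant in Python. The exact formula and settings used for the demo datasets are not stated in this thread, so the formula, the `scale_factor` default, and the function name `tfidf_normalize` are all assumptions, and the input is assumed to be a dense cells x peaks count matrix.

```python
import numpy as np


def tfidf_normalize(counts, scale_factor=1e4):
    """Log-scaled TF-IDF for a cells x peaks count matrix (one common variant).

    Assumptions: dense input, cells as rows, peaks as columns;
    the formula log1p(TF * IDF * scale_factor) is illustrative only.
    """
    counts = np.asarray(counts, dtype=float)

    # Term frequency: each cell's counts divided by that cell's total counts.
    cell_totals = counts.sum(axis=1, keepdims=True)
    cell_totals[cell_totals == 0] = 1.0  # avoid division by zero for empty cells
    tf = counts / cell_totals

    # Inverse document frequency: number of cells over each peak's total counts.
    peak_totals = counts.sum(axis=0, keepdims=True)
    peak_totals[peak_totals == 0] = 1.0  # avoid division by zero for empty peaks
    idf = counts.shape[0] / peak_totals

    return np.log1p(tf * idf * scale_factor)


# Toy example: 2 cells x 3 peaks of binary accessibility.
toy = np.array([[1, 0, 1],
                [0, 1, 1]])
normalized = tfidf_normalize(toy)
print(normalized.round(3))
```

Entries that are zero in the input stay zero in the output, and peaks open in fewer cells receive larger weights than peaks open in most cells.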