KevinMenden / scaden

Deep Learning based cell composition analysis with Scaden.
https://scaden.readthedocs.io
MIT License

Scaden paper test data #125

Open jth3galv opened 8 months ago

jth3galv commented 8 months ago

Hi Kevin, sorry to bother you here and misuse GitHub :) but I did not know how to reach you.

I am working with scaden and I need to do some tests. I am struggling to obtain the test data to reproduce the results of your paper.

Would it be possible to share them?

Thanks!

Giulio

WanderingHedgie commented 2 months ago

Hi jth3galv!

I don't know if you're still searching, but I found the datasets used for Scaden training on the 10x Genomics website. I made a file listing all of them (the names and cell counts match the datasets described in Supplementary Table 1 of the article). I hope it is useful to you!

Here is the file:

Information about the datasets used in the scaden method

1. General information

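The datasets gathered for training are: 6k PBMCs from a healthy donor, 8k PBMCs from a healthy donor, Frozen PBMCs (Donor A), and Frozen PBMCs (Donor C).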

The corresponding filtered matrices were downloaded from the 10x Genomics website. Using the filtered matrices avoids barcodes that should not be in the dataset because of errors.
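If only the filtered matrices are needed, the four download links can be generated instead of copied by hand. This is a minimal sketch, assuming the URL pattern visible in the wget commands listed in section 2; the variable and function names (`BASE`, `DATASETS`, `filtered_matrix_url`) are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the download URL of the filtered matrix for each dataset.
# The base URL and file naming follow the wget commands listed in this file.
BASE="https://cf.10xgenomics.com/samples/cell-exp"

# Each entry is "<version directory>/<sample id>", as it appears in the URLs.
DATASETS="1.1.0/pbmc6k 2.1.0/pbmc8k 1.1.0/frozen_pbmc_donor_a 1.1.0/frozen_pbmc_donor_c"

filtered_matrix_url() {
    # $1 = "<version directory>/<sample id>"
    local sample="${1##*/}"   # keep only the part after the last slash
    printf '%s/%s/%s_filtered_gene_bc_matrices.tar.gz\n' "$BASE" "$1" "$sample"
}

# Print one URL per dataset.
for d in $DATASETS; do
    filtered_matrix_url "$d"
done
```

Saved under a name such as `filtered_urls.sh` (hypothetical), the output can be handed straight to wget with `bash filtered_urls.sh | wget -i -`.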

Notes:

A ready-to-use dataset is available on the Scaden website. It contains all 4 datasets with 32k simulated samples: https://scaden.readthedocs.io/en/latest/datasets.html#human-pbmc

2. Download datasets for training

2.1 - 6k PBMCs from a healthy donor

Webpage

https://www.10xgenomics.com/datasets/6-k-pbm-cs-from-a-healthy-donor-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc6k/pbmc6k_web_summary.html

2.2 - 8k PBMCs from a healthy donor

Webpage

https://www.10xgenomics.com/datasets/8-k-pbm-cs-from-a-healthy-donor-2-standard-2-1-0

Input Files

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_fastqs.tar

Output Files

wget https://s3-us-west-2.amazonaws.com/10x.files/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_raw_gene_bc_matrices_h5.h5
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_web_summary.html
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc8k/pbmc8k_cloupe.cloupe

2.3 - Frozen PBMCs (Donor A)

Webpage

https://www.10xgenomics.com/datasets/frozen-pbm-cs-donor-a-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_web_summary.html

2.4 - Frozen PBMCs (Donor C)

Webpage

https://www.10xgenomics.com/datasets/frozen-pbm-cs-donor-c-1-standard-1-1-0

Input Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_fastqs.tar

Output Files

wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_possorted_genome_bam.bam
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_possorted_genome_bam_index.bam.bai
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_molecule_info.h5
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_filtered_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_analysis.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_metrics_summary.csv
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_c/frozen_pbmc_donor_c_web_summary.html
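Once the filtered archives listed above are downloaded, it may help to unpack each one into its own folder. The sketch below assumes (not verified here) that the archives all unpack into a top-level directory with the same name, so extracting them side by side would mix or overwrite files. The function name `extract_sample` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: unpack every *_filtered_gene_bc_matrices.tar.gz found in the
# current directory into a folder named after its sample id.

extract_sample() {
    # $1 = path to <sample>_filtered_gene_bc_matrices.tar.gz
    local archive="$1"
    local sample
    # Strip the directory and the common suffix to get the sample id.
    sample="$(basename "$archive" _filtered_gene_bc_matrices.tar.gz)"
    mkdir -p "$sample"
    tar -xzf "$archive" -C "$sample"
}

for f in ./*_filtered_gene_bc_matrices.tar.gz; do
    [ -e "$f" ] || continue   # skip when no archive matches the pattern
    extract_sample "$f"
done
```

After running it in the download directory, each sample (for example `pbmc6k`) should have its own folder containing the unpacked matrix files.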