Closed: inofechm closed this 3 years ago
Hi, thanks for your query. This code isn't in the repo. For the pseudobulk, we simply summed the reads for each gene across all cells; that is, we applied the base R function rowSums() to the count matrix of each individual tumor.
We have also found that using the raw R2 FASTQ files as input to a bulk RNA-seq pipeline gives results comparable to the count summation method.
Hope this helps.
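A minimal sketch of the summation described above, not the code used for the paper. The gene names, cell names, counts, and the `tumor_of_cell` mapping are made-up illustrations; it assumes a genes x cells count matrix.

```r
# Toy genes x cells count matrix (hypothetical values).
counts <- matrix(
  c(0, 2, 1,
    5, 0, 3,
    1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ESR1", "ERBB2", "KRT5"), c("cell1", "cell2", "cell3"))
)

# Pseudobulk for a single tumor: total reads per gene across all of its cells.
pseudobulk <- rowSums(counts)

# With cells from several tumors in one matrix, sum within each tumor
# to get a genes x tumors matrix. The cell-to-tumor mapping is hypothetical.
tumor_of_cell <- c(cell1 = "T1", cell2 = "T1", cell3 = "T2")
pseudobulk_by_tumor <- sapply(
  split(colnames(counts), tumor_of_cell[colnames(counts)]),
  function(cells) rowSums(counts[, cells, drop = FALSE])
)
```

If the counts are stored as a sparse matrix from the Matrix package, `Matrix::rowSums()` can be used in the same way.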
@inofechm I found the related code in the file ecotypes/generate_pseudobulk_mixture_file.snakemake.R. It mentions two methods: sum and average. The first one is probably what you need.
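To illustrate the difference between the two methods, here is a rough sketch. It is not taken from that script; `aggregate_cells` is a hypothetical helper and the counts are made up.

```r
# Hypothetical helper: collapse a genes x cells matrix into one value per gene.
aggregate_cells <- function(counts, method = c("sum", "average")) {
  method <- match.arg(method)
  if (method == "sum") {
    rowSums(counts)   # total reads per gene, as in the summation approach
  } else {
    rowMeans(counts)  # mean reads per gene per cell
  }
}

# Toy genes x cells count matrix (hypothetical values).
toy <- matrix(
  c(0, 2, 1,
    5, 0, 3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)

pb_sum <- aggregate_cells(toy, "sum")
pb_avg <- aggregate_cells(toy, "average")
```

The average is the sum divided by the number of cells, so within one sample the two differ only by a constant factor. Across samples with different cell numbers, the sum scales with the number of cells while the average does not.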
Can you please direct me to the code used to generate the pseudobulk RNA-seq data in the paper, and explain how to run the PAM50 subtyping on the pseudobulk? I see the bulk RNA-seq PAM50 code, but I want to apply the pseudobulk method to my own breast samples, so any pointers would be appreciated. Thank you.