czbiohub-sf / tabula-muris-senis

Tabula Muris Senis
http://tabula-muris-senis.ds.czbiohub.org
BSD 3-Clause "New" or "Revised" License

Mismatch between raw and processed data #11

Closed almog5690 closed 3 years ago

almog5690 commented 4 years ago

Hello, I pre-processed the raw data but could not reproduce the processed data. For instance, I downloaded the "Bladder_droplet" file from figshare and "tabula-muris-senis-droplet-processed-official-annotations-Bladder" from Amazon Web Services (AWS). I then applied size-factor normalization (10,000 counts per cell) and a log transform to the count matrix in the raw data, and the result did not match the processed data. Specifically, in the first cell ("AAACCTGAGTACGTTC-1-24-0-0"), the genes "Snhg6" and "Tram1" each have 1 UMI. After normalization I got 0.45 for both genes, whereas the processed data (from figshare) has 0.96 and 0.71, respectively. So my question is: did I do something wrong, or is there a problem with the processed data?
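For reference, the normalization described above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the function name `lognorm_cell`, the use of the natural log of (1 + x), and the example cell with a total of 17,600 UMIs are all assumptions made for the sketch, since the real total count of the cell is not given here.

```python
import math


def lognorm_cell(counts, target_sum=10_000):
    """Scale one cell's UMI counts so they sum to target_sum, then apply log(1 + x).

    Hypothetical helper for illustration; it is not taken from the
    tabula-muris-senis code base.
    """
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to scale; return zeros.
        return [0.0 for _ in counts]
    scale = target_sum / total
    return [math.log1p(c * scale) for c in counts]


# Hypothetical cell: two genes with 1 UMI each, and every other gene
# lumped into a single entry so that the cell total is 17,600 UMIs.
# The total is an assumption chosen to illustrate the arithmetic.
cell = [1, 1, 17_598]
normalized = lognorm_cell(cell)
print([round(v, 2) for v in normalized[:2]])
```

Under this scheme, two genes with the same raw count in the same cell always receive the same normalized value, because every gene in a cell is multiplied by the same scale factor. Distinct values such as 0.96 and 0.71 for two 1-UMI genes therefore cannot come from this transformation applied to these counts alone.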

aopisco commented 4 years ago

@almog5690 the data on figshare is currently out of sync with AWS. I'm working on getting that updated and will let you know here once that happens. In the meantime, all the processed files are also available from AWS.

aopisco commented 3 years ago

@almog5690 this problem should be solved now, so I'm closing the issue. Feel free to reopen it if that is not the case.