almaan / her2st

Her2 Breast Cancer Project

Discrepancy between spot coordinate meta file and count matrix file #3

Open sangwookbae opened 1 year ago

sangwookbae commented 1 year ago

Dear Alma Andersson,

First, thank you for publicly sharing your precious data. I'm currently trying to analyze your BRCA HER2 results (Zenodo 4751624), but I keep getting errors because the number of spots in the count files and the coordinate files don't exactly match. For example, your A1 sample seems to have 348 spots in the metadata (coordinate) file but only 346 spots in the count matrix file.
The following table shows the number of spots for each sample (A1 to H3) in each file (metadata file and count matrix file).

image

About half the samples seem to have such a discrepancy between the metadata and the count matrix, to varying degrees (the metadata file has 1, 2, or sometimes 3 more spots than the count matrix).

I tried a number of TSV file reading functions and can't seem to solve this problem. Has anyone else had the same problem, or am I the only one?

I would very much appreciate it if you could give me any comment on this. Thanks in advance.

OliiverHu commented 1 year ago

Hi. I think I have run into the same issue, and I also only have metadata files for these sections (section ID with spot count: A1 (348), B1 (295), C1 (177), D1 (309), E1 (587), F1 (692), G2 (475), H1 (613), J1 (254)). Do you know where I can find the rest of the metadata files?

almaan commented 1 year ago

Hi,

Sorry for the slow reply, @sangwookbae. I think @ludvigla would do a better job explaining this than I would.

Best, Alma

ludvigla commented 1 year ago

Hi,

Sorry for the inconvenience, but there is a simple explanation for the discrepancy. The expression matrices were filtered to remove spots with zero observed expression, while the spot files contain coordinates for all spots detected under the tissue sections. The simplest way to fix this issue is to also remove the empty spots from the coordinate tables.
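
A minimal sketch of that filtering in Python with pandas, under a few assumptions that should be checked against the actual files: that the count matrix has one row per spot with an index label of the form `XxY` (for example `10x13`), and that the coordinate table has integer array coordinates in columns named `x` and `y`. The function name `drop_empty_spots` and the file names in the usage comment are hypothetical.

```python
import pandas as pd


def drop_empty_spots(counts: pd.DataFrame, spots: pd.DataFrame) -> pd.DataFrame:
    """Keep only the coordinate rows whose spot also appears in the count matrix.

    Assumes (not confirmed by the thread) that `counts` is indexed by "XxY"
    labels and that `spots` has integer columns named "x" and "y".
    """
    # Rebuild the "XxY" label for every row of the coordinate table.
    labels = spots["x"].astype(int).astype(str) + "x" + spots["y"].astype(int).astype(str)
    labelled = spots.set_index(labels)

    # Spots present in the count matrix, in count-matrix order.
    shared = counts.index[counts.index.isin(labelled.index)]

    # Empty spots (filtered out of the counts) are dropped here.
    return labelled.loc[shared]


# Usage with hypothetical file names:
# counts = pd.read_csv("A1.tsv", sep="\t", index_col=0)
# spots = pd.read_csv("A1_selection.tsv", sep="\t")
# spots = drop_empty_spots(counts, spots)
# assert len(spots) == len(counts)
```

Returning the rows in count-matrix order keeps the two tables aligned row for row, which is usually what downstream tools expect.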

Cheers, Ludvig