AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

Rerun transcriptomic dimension reduction module #1397

Closed jaclyn-taroni closed 2 years ago

jaclyn-taroni commented 2 years ago

Breaking out something that's different in #1286 in the interest of better examining those changes.

jaclyn-taroni commented 2 years ago

I will take a closer look at this tomorrow, but I think what's going on here is just changes in the metadata because this hasn't been run in a year. This module joins the metadata to the scores in a step prior to plotting (and those tables are committed to the repository). It would be better if the scores were saved without metadata and the metadata were joined only at plotting time, but I think we'd be getting into recreational revision territory if we were to change that at this point.
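For illustration only, here is a minimal sketch of the "save scores without metadata, join at plotting time" idea described above. It is not the module's code: the function name `join_for_plotting`, the `sample_id` key, and the `PC1`/`PC2`/`label` columns are all hypothetical stand-ins.

```python
# Hypothetical sketch: keep the score table free of metadata and attach
# metadata only when building the table used for plotting.

def join_for_plotting(scores, metadata, key="sample_id"):
    """Left-join metadata onto score rows by `key` without modifying `scores`."""
    meta_by_id = {row[key]: row for row in metadata}
    joined = []
    for row in scores:
        meta = meta_by_id.get(row[key], {})
        # Score columns take precedence if a column name collides.
        joined.append({**meta, **row})
    return joined


# The committed artifact would be the scores alone (assumed column names).
scores = [
    {"sample_id": "S1", "PC1": 0.12, "PC2": -1.5},
    {"sample_id": "S2", "PC1": -0.40, "PC2": 0.9},
]

# Two metadata versions, e.g. from different data releases (made-up values).
metadata_old = [{"sample_id": "S1", "label": "A"}, {"sample_id": "S2", "label": "B"}]
metadata_new = [{"sample_id": "S1", "label": "A2"}, {"sample_id": "S2", "label": "B"}]

old_plot_table = join_for_plotting(scores, metadata_old)
new_plot_table = join_for_plotting(scores, metadata_new)
```

With this layout, a metadata update changes only the table built for plotting; the saved scores stay the same, so a diff in the committed file would point to a real change in the scores.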

jaclyn-taroni commented 2 years ago

I'll also note that t-SNE values changing is not new (and still a mystery): https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/716

The t-SNE values don't go anywhere (e.g., into a plot), so I think it is sufficient to keep that issue open.

sjspielman commented 2 years ago

> I will take a closer look at this tomorrow, but I think what's going on here is just changes in the metadata because this hasn't been run in a year. This module joins the metadata to the scores in a step prior to plotting (and those tables are committed to the repository)

I'll compare to how running the module looks w/ the v19 data which, according to versioning dates, would have been the data most recently used to generate these files.

Edit: Tried the module with v19 and v18 and there are still some substantial diffs for both data releases, including both values and columns (both the existence of certain columns and their order).
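As a rough sketch of how the three kinds of diff mentioned above (column existence, column order, and values) could be separated, something like the following might help. It is an assumption-laden illustration, not what was actually run: `compare_tables`, the `sample_id` key, the tolerance, and the example columns are all hypothetical.

```python
import math


def compare_tables(old_rows, new_rows, key="sample_id", tol=1e-8):
    """Report column and value differences between two lists of row dicts."""
    old_cols = list(old_rows[0].keys()) if old_rows else []
    new_cols = list(new_rows[0].keys()) if new_rows else []
    shared_in_old_order = [c for c in old_cols if c in new_cols]
    shared_in_new_order = [c for c in new_cols if c in old_cols]
    report = {
        "only_in_old": [c for c in old_cols if c not in new_cols],
        "only_in_new": [c for c in new_cols if c not in old_cols],
        "order_changed": shared_in_old_order != shared_in_new_order,
        "value_diffs": [],
    }
    new_by_id = {row[key]: row for row in new_rows}
    for old in old_rows:
        new = new_by_id.get(old[key])
        if new is None:
            continue  # Row missing from the new table; not counted as a value diff.
        for col in shared_in_old_order:
            if col == key:
                continue
            a, b = old[col], new[col]
            numeric = isinstance(a, (int, float)) and isinstance(b, (int, float))
            same = math.isclose(a, b, abs_tol=tol) if numeric else a == b
            if not same:
                report["value_diffs"].append((old[key], col, a, b))
    return report


# Made-up rows standing in for a committed table and a rerun table.
committed = [{"sample_id": "S1", "X1": 1.0, "X2": 2.0, "label": "A"}]
rerun = [{"sample_id": "S1", "X2": 2.5, "X1": 1.0, "group": "A"}]
report = compare_tables(committed, rerun)
```

Comparing numeric columns with a tolerance keeps floating-point noise from being reported as a substantive value change.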

sjspielman commented 2 years ago

Noting this will satisfy a checkbox here: https://github.com/AlexsLemonade/OpenPBTA-analysis/issues/1455