AlexsLemonade / scpca-nf

scpca-nf is the Nextflow workflow for processing Single-cell Pediatric Cancer Atlas Portal data
BSD 3-Clause "New" or "Revised" License

Improve scanpy compatibility #774

Closed (jashapiro closed 2 months ago)

jashapiro commented 3 months ago

Closes #773

I'm filing this as a draft because I have not yet really tested it, but I wanted to get something up while it was fresh in my mind.

The goal here is to improve the compatibility with scanpy as described in #773, so I have done 4 main things:

The last step involves exporting the variance-explained data and then importing it separately, on the assumption that the PCA was centered and used highly variable genes. I could (and probably should) turn those assumptions into arguments for the script, just to be safe and maybe a bit more future-proof.
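To make that idea concrete, here is a rough sketch of what turning the two assumptions into arguments might look like. It is not the code in this PR: the function names, the `--uncentered` and `--all-genes` flags, the TSV column names, and the example values are all hypothetical. It only assumes that scanpy looks for PCA metadata under a `pca` entry in `uns` with `params`, `variance`, and `variance_ratio` keys, which should be checked against the scanpy version in use.

```python
import argparse
import csv
import io


def read_variance_table(handle):
    """Read per-PC variance values from a TSV with a header row.

    The column names ("variance", "variance_ratio") are hypothetical.
    """
    variance, variance_ratio = [], []
    for row in csv.DictReader(handle, delimiter="\t"):
        variance.append(float(row["variance"]))
        variance_ratio.append(float(row["variance_ratio"]))
    return variance, variance_ratio


def build_pca_uns(variance, variance_ratio, zero_center=True, use_highly_variable=True):
    """Build a dict shaped like the PCA metadata scanpy is assumed to expect."""
    if len(variance) != len(variance_ratio):
        raise ValueError("variance and variance_ratio must have the same length")
    return {
        "params": {
            "zero_center": zero_center,
            "use_highly_variable": use_highly_variable,
        },
        "variance": list(variance),
        "variance_ratio": list(variance_ratio),
    }


def make_parser():
    """Expose the two PCA assumptions as flags; both default to True."""
    parser = argparse.ArgumentParser(description="Add PCA metadata for scanpy")
    parser.add_argument(
        "--uncentered",
        dest="zero_center",
        action="store_false",
        help="Set if the PCA was NOT centered",
    )
    parser.add_argument(
        "--all-genes",
        dest="use_highly_variable",
        action="store_false",
        help="Set if the PCA did NOT use only highly variable genes",
    )
    return parser


# Illustrative input only; these numbers are made up.
example_tsv = "PC\tvariance\tvariance_ratio\n1\t12.5\t0.25\n2\t6.0\t0.12\n"
args = make_parser().parse_args([])  # no flags: keep both assumptions
variance, ratio = read_variance_table(io.StringIO(example_tsv))
pca_uns = build_pca_uns(variance, ratio, args.zero_center, args.use_highly_variable)
```

In the real script, something like `pca_uns` would presumably be assigned to `adata.uns["pca"]`, with the two lists converted to NumPy arrays. Using `store_false` flags keeps the current behavior as the default while still recording explicitly what was assumed.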

There are probably also a few places where I am making other assumptions that I should check more explicitly. As I said, a draft...

I also gave the script a more general name, but annoyingly there were apparently enough changes that GitHub is displaying it as a deletion and recreation rather than a rename. Probably because it was automatically run through a code formatter.

jashapiro commented 2 months ago

Test run here completed successfully, and on first examination the outputs are as expected.