I started writing the code that does the transfer learning (loads the model pretrained on LINCS for finetuning on Trapnell) and ran into a stumbling block: LINCS has 978 genes, one of which isn't part of Trapnell. Furthermore, the ordering of the genes between lincs_full_smiles.h5ad and trapnell_cpa.h5ad is completely different.
Hence:
Generate a new trapnell_cpa_lincs_genes.h5ad which contains just the genes that are also part of LINCS, in exactly the same order as in the LINCS file.
Generate a new LINCS_full_smiles_trapnell_genes.h5ad which contains just the genes that are also part of Trapnell, again in the same order.
I think generating & storing the datasets is less error-prone than trying to fix this in the code.
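A rough sketch of what I have in mind, in case it helps. It assumes the gene identifiers live in `.var_names` of both files, are unique, and use the same naming scheme in both datasets; I haven't checked any of that against the actual files. The helper names (`shared_gene_order`, `align_to_shared_genes`, `write_aligned_datasets`) are made up for illustration.

```python
def shared_gene_order(reference_genes, other_genes):
    """Return the genes present in both lists, in the order of reference_genes."""
    other = set(other_genes)
    return [gene for gene in reference_genes if gene in other]


def align_to_shared_genes(adata_ref, adata_other):
    """Subset two AnnData objects to their shared genes, in identical order.

    Assumption: gene identifiers are stored in `.var_names`, are unique,
    and follow the same naming scheme in both objects.
    """
    shared = shared_gene_order(list(adata_ref.var_names), list(adata_other.var_names))
    # Indexing by a list of var names selects the columns in exactly that order.
    ref_sub = adata_ref[:, shared].copy()
    other_sub = adata_other[:, shared].copy()
    # Sanity check: both subsets must now have the same genes in the same order.
    assert list(ref_sub.var_names) == list(other_sub.var_names)
    return ref_sub, other_sub


def write_aligned_datasets(lincs_path, trapnell_path, lincs_out, trapnell_out):
    """Read both datasets, align their genes, and store the results."""
    import anndata as ad  # imported here so the helpers above work without it

    lincs = ad.read_h5ad(lincs_path)
    trapnell = ad.read_h5ad(trapnell_path)
    # LINCS is used as the reference, so its gene order is the one that is kept.
    lincs_sub, trapnell_sub = align_to_shared_genes(lincs, trapnell)
    lincs_sub.write_h5ad(lincs_out)
    trapnell_sub.write_h5ad(trapnell_out)
```

Calling `write_aligned_datasets("lincs_full_smiles.h5ad", "trapnell_cpa.h5ad", "LINCS_full_smiles_trapnell_genes.h5ad", "trapnell_cpa_lincs_genes.h5ad")` would then produce the two files. Using LINCS as the reference keeps the gene order the pretrained model was trained on.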
Can you do this @MxMstrmn? We can also talk about it tomorrow.