iDrDex / star_api

API access to STARGEO: stargeo.org

ComBat skipping GSE48072 #14

Open panjames opened 8 years ago

panjames commented 8 years ago

When I run the code below, which asks ComBat to run on 4 studies, I only get 3 studies back in my samples dataframe. Any ideas why it's skipping GSE48072?

from starapi import *
import starapi.main as api
import starapi.analysis as analysis
from starapi import conf
conf.configure('./')
import pandas as pd
from easydict import *
sample_class = api.get_annotations("""BMI217_schizo_blood_cases =='BMI217_schizo_blood_cases' or \
BMI217_schizo_blood_controls =='BMI217_schizo_blood_controls' \
or tissue=='Tissue' or age =='Age' or sex =='sex'""", \
                                   """BMI217_schizo_blood_cases =='BMI217_schizo_blood_cases' \
                                   or BMI217_schizo_blood_controls=='BMI217_schizo_blood_controls' \
                                   or tissue=='Tissue' or age =='Age' or sex =='sex'""")

import numpy as np

# Replace NaN with empty strings so the two label columns can be concatenated
sample_class = sample_class.replace(np.nan, "")
sample_class['schizophrenia'] = sample_class.bmi217_schizo_blood_controls + sample_class.bmi217_schizo_blood_cases

# Per-study counts of labelled controls and cases
schizo_complete = sample_class.replace("", np.nan).groupby('gse_name')\
.count()[['bmi217_schizo_blood_controls', 'bmi217_schizo_blood_cases']]

# .ix was removed in pandas 1.0; .loc performs the same label-based selection here
schizo_samples = sample_class.set_index('gse_name').loc[['GSE18312', 'GSE27383', 'GSE38485', 'GSE48072']]

# Run ComBat on the selected studies
combat = analysis.combat(schizo_samples.reset_index(), 'schizophrenia')

# Extract expression and samples from the combat result
expression, samples = combat
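
One way to narrow this down before calling ComBat is to check, per requested study, how many samples carry each label. The sketch below is only a diagnostic aid and not a confirmed explanation: the issue does not say whether `analysis.combat` drops studies that lack one of the classes, so treat that as an assumption to test. The toy `sample_class` frame and the `summarize_studies` helper are hypothetical; only the column names and GSE identifiers come from the code above.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by api.get_annotations();
# column names mirror the issue, the rows are invented for illustration.
sample_class = pd.DataFrame({
    "gse_name": ["GSE18312", "GSE18312", "GSE27383",
                 "GSE27383", "GSE48072", "GSE48072"],
    "schizophrenia": [
        "BMI217_schizo_blood_cases", "BMI217_schizo_blood_controls",
        "BMI217_schizo_blood_cases", "BMI217_schizo_blood_controls",
        "BMI217_schizo_blood_cases", "",  # one GSE48072 sample has no label
    ],
})

requested = ["GSE18312", "GSE27383", "GSE38485", "GSE48072"]


def summarize_studies(samples, requested, label_col="schizophrenia"):
    """Count non-empty labels per requested study; absent studies get zeros."""
    labelled = samples[samples[label_col] != ""]
    counts = (
        labelled.groupby(["gse_name", label_col]).size()
        .unstack(fill_value=0)              # one column per label value
        .reindex(requested, fill_value=0)   # keep studies with no rows at all
    )
    # Number of distinct labels that actually occur in each study
    counts["n_classes"] = (counts > 0).sum(axis=1)
    return counts


summary = summarize_studies(sample_class, requested)
print(summary)

# Studies that are missing entirely or have only one class labelled
suspect = summary.index[summary["n_classes"] < 2].tolist()
print("Studies with fewer than two classes:", suspect)
```

Running the same helper on the real `sample_class` frame would show whether GSE48072 is present at all and whether it has both cases and controls labelled, which helps separate a data problem from a problem inside the ComBat step.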