MartinBaGar commented 2 years ago

Hello! First of all, thank you very much for your great tool. I haven't been able to use it successfully yet, but it seems very helpful. I have been using R for 2 months for an internship and it's my first time, so I hope I won't ask a stupid question.

I wanted to perform a differential expression analysis (DEA) for the TCGA-SKCM project, but I'm stuck at the TCGAanalyze_DEA function, which fails with "Error in edgeR::DGEList(counts = TOC, group = tumorType) : Length of 'group' must equal number of columns in 'counts'". I have been using the TCGAbiolinks vignette, the TCGAWorkflow vignette and the Case Studies to try to understand, but I couldn't make it run properly. I have also run Case Study n.1 on my machine and it works perfectly, so I think the problem is related to the TCGA-SKCM project.

Here is my script:

query.skcm.full <- GDCquery(
  project = "TCGA-SKCM",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  sample.type = c("Primary Tumor","Solid Tissue Normal")
)

GDCdownload(
    query = query.skcm.full,
    files.per.chunk = 100
)

skcm.exp <- GDCprepare(
    query = query.skcm.full, 
    save = TRUE, 
    save.filename = "skcmExp.rda"
)
# get subtype information
information.subtype <- TCGAquery_subtype(tumor = "SKCM")

# get clinical data
information.clinical <- GDCquery_clinic(project = "TCGA-SKCM",type = "clinical") 

# Which samples are Primary Tumor
samples.primary.tumour <- skcm.exp$barcode[skcm.exp$shortLetterCode == "TP"]

# which samples are solid tissue normal
samples.solid.tissue.normal <- skcm.exp$barcode[skcm.exp$shortLetterCode == "NT"]

dataPrep <- TCGAanalyze_Preprocessing(skcm.exp)

dataNorm <- TCGAanalyze_Normalization(
    tabDF = dataPrep,
    geneInfo = geneInfoHT
)

dataFilt <- TCGAanalyze_Filtering(
    tabDF = dataNorm,
    method = "quantile", 
    qnt.cut =  0.25
)

dataDEGs <- TCGAanalyze_DEA(
    mat1 = dataFilt[,samples.solid.tissue.normal],
    mat2 = dataFilt[,samples.primary.tumour],
    Cond1type = "Normal",
    Cond2type = "Tumor",
    fdr.cut = 0.01 ,
    logFC.cut = 2,
    method = "glmLRT",
    pipeline = "edgeR"
)

Here is my session info:

Thank you very much in advance! Of course, tell me if you need more information.

Best, Martin

tiagochst commented 2 years ago

Hi Martin,

Since samples.solid.tissue.normal contains only one sample, you need to set drop = FALSE: dataFilt[, samples.solid.tissue.normal, drop = FALSE]. Without it, dataFilt[, samples.solid.tissue.normal] returns a vector instead of a matrix, so the number of columns no longer matches the length of the group.
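
To illustrate the behaviour, here is a minimal base R sketch on a toy matrix. The object names (counts, one.sample, dropped, kept) and the values are made up for the example and are not TCGA data; it only shows how single-column subsetting behaves with and without drop = FALSE.

```r
# Toy count matrix: 3 genes x 3 samples (hypothetical values, not TCGA data)
counts <- matrix(
    1:9,
    nrow = 3,
    dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# A group that happens to contain a single sample
one.sample <- "s1"

# Default subsetting drops the column dimension: the result is a named vector
dropped <- counts[, one.sample]
is.matrix(dropped)   # FALSE
ncol(dropped)        # NULL, there are no columns to count

# drop = FALSE keeps the result as a 3 x 1 matrix
kept <- counts[, one.sample, drop = FALSE]
is.matrix(kept)      # TRUE
dim(kept)            # 3 1
```

Using drop = FALSE on both mat1 and mat2 should be harmless when a group has several samples, so it may be simplest to always include it.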

MartinBaGar commented 2 years ago

Thank you so much for your quick answer and for the solution, it's working perfectly ! I love your tool !