nsheff / LOLA

Locus Overlap Analysis: Enrichment of Genomic Ranges
http://code.databio.org/LOLA

Error if GRangesList has names #38

Open j-lawson opened 4 years ago

j-lawson commented 4 years ago

If the GRangesList that is given to runLOLA as part of the regionSetDB has names for each region set, an error results. This can be fixed by setting the names to NULL to trigger the code chunk referenced below. Suggested fix: make the conditional code chunk no longer conditional.

https://github.com/nsheff/LOLA/blob/8262ff58d54ec4e335f71f6d4ce1d3a796a5bfce/R/calcLocEnrichment.R#L60
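The workaround described above (clearing the names before calling runLOLA) might look roughly like the sketch below. The helper name `dropRegionSetNames` is hypothetical and not part of LOLA, and a plain list stands in for a real regionDB so the snippet runs without the package installed.

```r
# Hypothetical helper (not a LOLA function): return a copy of regionDB
# whose regionGRL element has no names.
dropRegionSetNames <- function(regionDB) {
  names(regionDB$regionGRL) <- NULL
  regionDB
}

# Stand-in for a real regionDB; an actual one would hold a GRangesList here.
mockDB <- list(regionGRL = list(dbGR1 = 1:3, dbGR2 = 4:6))
cleanDB <- dropRegionSetNames(mockDB)

is.null(names(cleanDB$regionGRL))   # names are gone in the copy
names(mockDB$regionGRL)             # the original object is left untouched
```

With a real database the call would presumably be `runLOLA(userSets, userUniverse, dropRegionSetNames(regionDB), cores=1)`. This only sidesteps the error; it does not preserve the names in the results.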

nsheff commented 4 years ago

I'm not sure why that would happen... Are you suggesting we throw away all names all the time? The referenced code appears to set names only if they aren't already set.

It would be helpful if you pasted the text of the error.

j-lawson commented 4 years ago

Error:

Error in bmerge(i, x, leftcols, rightcols, roll, rollends, nomatch, mult, :
  Incompatible join types: x.dbSet (factor) and i.dbSet (integer). Factor columns must join to factor or character columns.

When the LOLA region databases are loaded, the names of regionDB$regionGRL are NULL by default, so the function works. When regionDB$regionGRL has non-NULL names, the function errors. I think the problem comes from joining on columns of different classes (the merge in line 175): scoreTable = scoreTable[annotationDT]. In the first case (names are NULL), the dbSet column is integer in both tables. In the second case, dbSet is integer in one table and factor in the other (the factor being the regionGRL names).
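The class mismatch described above can be reproduced outside LOLA with two small data.table objects. This is a minimal sketch, not LOLA's actual code: the tables reuse the names scoreTable and annotationDT only for illustration, the `support` and `filename` columns and the `regionSetNames` vector are made up, and the repair shown (mapping names back to integer positions) is just one possible approach.

```r
library(data.table)

# Illustrative stand-ins: dbSet is a factor in one table and an integer in the other.
regionSetNames <- c("dbGR1", "dbGR2")
scoreTable   <- data.table(dbSet = factor(regionSetNames), support = c(10L, 3L))
annotationDT <- data.table(dbSet = 1:2, filename = c("a.bed", "b.bed"))
setkey(scoreTable, dbSet)
setkey(annotationDT, dbSet)

# Joining a factor key to an integer key is expected to fail; capture the message.
joinError <- tryCatch(
  { scoreTable[annotationDT]; NULL },
  error = function(e) conditionMessage(e)
)

# One possible repair: convert the names to their integer positions, then re-key.
scoreTable[, dbSet := match(as.character(dbSet), regionSetNames)]
setkey(scoreTable, dbSet)
joined <- scoreTable[annotationDT]
```

Using `match()` against the original name vector, rather than `as.integer()` on the factor, avoids relying on factor level order, which is alphabetical and would not follow positional order for names like "dbGR10".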

Example:

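library(LOLA)
# Load the example region database shipped with LOLA
dbPath = system.file("extdata", "hg19", package="LOLA")
regionDB = loadRegionDB(dbLocation=dbPath)
# These load the example objects userUniverse and userSets
data("sample_universe", package="LOLA")
data("sample_input", package="LOLA")
# Naming the region sets is what triggers the error
names(regionDB$regionGRL) = paste0("dbGR", seq_along(regionDB$regionGRL))

res = runLOLA(userSets, userUniverse, regionDB, cores=1)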