haowulab / DSS

Suggest adding sorting to makeBSseqData #11

Open Shians opened 4 years ago

Shians commented 4 years ago

When using makeBSseqData, the merge step can change the ordering of the rows: you can input two sorted datasets and the merge will output an unsorted one, which then triggers the warning. You can fix this by adding alldat <- alldat[order(alldat$chr, alldat$pos), ] at line 34, directly after the merging step. Alternatively, you can set sort = FALSE in merge(), but I'm not 100% sure that guarantees the original order is preserved.
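As a rough sketch of the suggested fix, outside of DSS itself: the data frames `s1` and `s2` below are made-up stand-ins for two per-sample inputs, and only the `merge()` and `order()` calls mirror what is described above. The idea is to treat the row order coming out of `merge()` as unreliable and re-sort explicitly afterwards.

```r
# Hypothetical per-sample inputs, each already sorted by chr and pos.
s1 <- data.frame(
    chr = c("chr1", "chr1", "chr2"),
    pos = c(999, 1000, 50),
    N1 = c(10, 10, 12),
    X1 = c(8, 8, 6)
)
s2 <- data.frame(
    chr = c("chr1", "chr1", "chr2"),
    pos = c(999, 1000, 75),
    N2 = c(11, 11, 9),
    X2 = c(9, 9, 4)
)

# Outer join on the shared key columns; the resulting row order is not relied on.
alldat <- merge(s1, s2, by = c("chr", "pos"), all = TRUE)

# Restore the order explicitly: chromosome name first, then numeric position.
# Note that chromosome names are compared as text here, so "chr10" sorts before "chr2".
alldat <- alldat[order(alldat$chr, alldat$pos), ]
rownames(alldat) <- NULL

print(alldat)
```

Because the sort happens after the merge, this works regardless of how `merge()` ordered the rows.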

haowulab commented 4 years ago

Thanks a lot for the comment. I'm not sure this is true: "input two sorted datasets and the merge will output an unsorted one". I have never observed this. My impression was that when the inputs are sorted, the output will always be sorted, especially since I used all=TRUE.

I left the warning message there intentionally to let the user know that the inputs are not sorted. Well, maybe it's not that important.

I did have some code for sorting the results, as you can see. But I was stupid and looped over the chromosomes. Your way is much cleaner. I'll modify that.

Shians commented 4 years ago

I've been bitten by merge's strange behaviour before; I really don't know what it's doing when it sorts. Here is a minimal example:

```r
x <- data.frame(
    chr = "chr1",
    pos = c(999, 1000),
    N1 = 10,
    X1 = 8
)

y <- data.frame(
    chr = "chr1",
    pos = c(999, 1000),
    N2 = 11,
    X2 = 9
)

merge(x, y, all = TRUE)
#>    chr  pos N1 X1 N2 X2
#> 1 chr1 1000 10  8 11  9
#> 2 chr1  999 10  8 11  9
```

Created on 2020-06-10 by the reprex package (v0.3.0)
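The row order in the reprex output matches what you would get if the positions were compared as text rather than as numbers. Whether that is what `merge()` does internally is an assumption here, not something established in this thread. The sketch below rebuilds the printed result by hand, checks it with a hypothetical helper `sorted_within_chr()` (not part of DSS), and applies the explicit re-sort.

```r
# Hypothetical helper (not part of DSS): TRUE if positions increase within each chromosome.
sorted_within_chr <- function(df) {
    all(!tapply(df$pos, df$chr, is.unsorted))
}

# Rows in exactly the order printed in the reprex output.
merged <- data.frame(
    chr = "chr1",
    pos = c(1000, 999),
    N1 = 10, X1 = 8, N2 = 11, X2 = 9
)

sorted_within_chr(merged)
#> [1] FALSE

# Comparing the positions as text reproduces that order, since "1" < "9".
sort(c("999", "1000"))
#> [1] "1000" "999"

# Comparing them as numbers does not.
sort(c(999, 1000))
#> [1]  999 1000

# Explicit re-sort on the numeric column restores the expected order.
fixed <- merged[order(merged$chr, merged$pos), ]
sorted_within_chr(fixed)
#> [1] TRUE
```

A check like this could run right after the merge so that the warning fires only when the re-sorted result is still out of order.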