Closed sanchit-saini closed 4 years ago
This is a great start. What do you think about writing some barebones documentation and a couple simple tests with each code submission?
Yes, this sounds good
Great progress. Left a few minor comments.
hey @lawremi
I think TrackHub class is complete Now, I'm working on TrackHubGenome class, parsing of genome.txt can be done with the getGenomesContent
function.
After implementing all the supporting functions of TrackHubGenome class I'll write unit tests and documentation for them.
And I was playing with TrackHub class and I think it misbehaving and root cause for this is uriExists
function.
When TrackHub class is called with URI of the non-existing directory along with create option as TRUE it throws an error.
> TrackHub("/home/sanchit/test", TRUE)
Error: $ operator is invalid for atomic vectors
> traceback()
3: uriIsLocal(x)
2: uriExists(uri)
1: TrackHub("/home/sanchit/test", TRUE)
This issue is also present in Quickload class as it also uses uriExists
function
I think this can be fixed
https://github.com/lawremi/rtracklayer/blob/0519c8f11c5b83375767b5b41b8ec68c4b172e17/R/io.R#L263
with
if (uriIsLocal(uri)) {
> ql <- Quickload(system.file("tests", "quickload", package = "rtracklayer"))
> ql$Error in names(x) :
call to standardGeneric("genome") apparently not from the body of that generic function
traceback() 10: names(x) 9: names(x) 8: grep(pattern, values, value = TRUE) 7: findExactMatches(pattern, values) 6: findMatches(pattern, names(x)) 5: .DollarNames.default(object, pattern = sprintf("^%s", makeRegexpSafe(suffix))) 4: .DollarNames(object, pattern = sprintf("^%s", makeRegexpSafe(suffix))) 3: specialOpCompletionsHelper(op, suffix, prefix) 2: specialCompletions(text, spl) 1: .completeToken()
By changing definition of setMethod to
setMethod("names", "Quickload", function(x) genome(x))
It works as expected but I don't why and I'm not even sure this correct or not
Please look into this and also review the updated changes. Thanks :)
Thanks for finding the uriExists()
issue. Please just add the fix to this pull request. Your fix for the autocomplete issue is correct. Passing genome
directly to setMethod()
confuses the underlying methods machinery. My comments should be in the review.
Thanks for finding the
uriExists()
issue. Please just add the fix to this pull request. Your fix for the autocomplete issue is correct. Passinggenome
directly tosetMethod()
confuses the underlying methods machinery. My comments should be in the review.
I am unable to find these comments. I've looked in the review tab.
I think you should be able to see them under the "Files changed" tab. They're inline comments on the diff.
I've also looked under the "Files changed" tab but still unable to find comments for the last commit. Only first commit "add TrackHub Class" have comments.
It hit the Review button so now stuff should be obvious. Sorry, I'm not too experienced with this review functionality.
Hey @lawremi I'm having trouble in parsing and storing trackDb.txt I'm using this file for reference http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt along with UCSC docs https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html How should I store this data? I think we need a tree-like data structure for storing this. But I'm not sure how should I implement this. Any pointers would be helpful. And also can you please review the updated changes.
A tree probably makes sense. You could define a class called "TrackContainer", and it could contain a List derivative (like "TrackList"), with the @elementType
being the class union of "Track" and "TrackContainer".
trackDb is represented by TrackContainer
class.
TrackContainer
assumed trackDb file should be indented with tabs(\t
) otherwise It will not work properly.
TrackContainer(
trackList = list()
childrenList = list()
parentLookUp = list()
)
This is how above tree should look after parsing:
trackList[1]$A
trackList[2]$X
childrenList[["A"]][[1]]$B
childrenList[["A"]][[2]]$C
childrenList[["C"]][[1]]$D
childrenList[["X"]][[1]]$Y
parentLookUp[ "A/B", "C/D", "X/Y" ]
parentLookUp
could be used for retrieving any nested children node
, First search over the parentLookUp
for a parent
of the node
after obtaining a parent
we can simply search the element inside childrenList[[parent]]
.
I'll update this specific part of the code with comments which would be helpful for maintainability and understanding logic in the future.
@lawremi Can you please review updated changes.
@lawremi Does this implementation look fine?
Submitted a review. Please make sure to add tests ASAP.
Submitted a review. Please make sure to add tests ASAP.
yes, I will.
Currently, I'm working on getTrackDbContent()
once it's get completed I will push the code.
@lawremi can you please review the updated changes
For now TrackHub class load trackHub repository and represent the properties as a object.
It support these methods :
genome(x)
: Get the URI pointing to the TrackHub repository.length(x)
: number of genomes in the repositoryuri(x)
: Get the URI pointing to the TrackHub repository.E.g