lawremi / rtracklayer

R interface to genome annotation files and the UCSC genome browser

WIP TrackHub feature #22

Closed sanchit-saini closed 4 years ago

sanchit-saini commented 4 years ago

For now, the TrackHub class loads a TrackHub repository and represents its properties as an object.

It supports these methods:

genome(x): Get the genomes in the TrackHub repository.
length(x): Number of genomes in the repository.
uri(x): Get the URI pointing to the TrackHub repository.

E.g.:

library(rtracklayer)
th <- TrackHub("http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/")
genome(th)
length(th)
uri(th)
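
To make the meaning of genome(x) and length(x) concrete, here is a minimal sketch in base R of how genome names could be read from a hub's genomes.txt, the file that UCSC hubs use to list one "genome"/"trackDb" pair per assembly. The file content below is hypothetical, and this is not necessarily how the TrackHub class does it internally.

```r
# Hypothetical genomes.txt content; a real hub serves this file next to hub.txt.
genomes_txt <- c(
  "genome hg19",
  "trackDb hg19/trackDb.txt",
  "",
  "genome mm9",
  "trackDb mm9/trackDb.txt"
)

# Keep only the "genome <name>" lines and strip the key.
genome_lines <- grep("^genome\\s+", genomes_txt, value = TRUE)
hub_genomes <- sub("^genome\\s+", "", genome_lines)

hub_genomes          # the kind of value genome(th) is described as returning
length(hub_genomes)  # the kind of value length(th) is described as returning
```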
sanchit-saini commented 4 years ago

This is a great start. What do you think about writing some barebones documentation and a couple simple tests with each code submission?

Yes, this sounds good

lawremi commented 4 years ago

Great progress. Left a few minor comments.

sanchit-saini commented 4 years ago

hey @lawremi

Please look into this and also review the updated changes. Thanks :)

lawremi commented 4 years ago

Thanks for finding the uriExists() issue. Please just add the fix to this pull request. Your fix for the autocomplete issue is correct. Passing genome directly to setMethod() confuses the underlying methods machinery. My comments should be in the review.
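
The thread does not show the fix itself. As a rough sketch, one common way to avoid handing the function object to setMethod() is to refer to the generic by its name as a string. The generic and the TrackHubSketch class below are hypothetical stand-ins defined locally so the example runs on its own; the real generic and class live in other packages.

```r
library(methods)

# Hypothetical stand-ins, defined here only so the sketch is self-contained.
setGeneric("genome", function(x) standardGeneric("genome"))
setClass("TrackHubSketch", representation(genomes = "character"))

# Register the method by the generic's name (a string), not the function object.
setMethod("genome", "TrackHubSketch", function(x) x@genomes)

th <- new("TrackHubSketch", genomes = c("hg19", "mm9"))
genome(th)
```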

sanchit-saini commented 4 years ago

Thanks for finding the uriExists() issue. Please just add the fix to this pull request. Your fix for the autocomplete issue is correct. Passing genome directly to setMethod() confuses the underlying methods machinery. My comments should be in the review.

I am unable to find these comments. I've looked in the review tab.

lawremi commented 4 years ago

I think you should be able to see them under the "Files changed" tab. They're inline comments on the diff.

sanchit-saini commented 4 years ago

I've also looked under the "Files changed" tab but am still unable to find comments for the last commit. Only the first commit, "add TrackHub Class", has comments.

lawremi commented 4 years ago

I hit the Review button, so now stuff should be obvious. Sorry, I'm not too experienced with this review functionality.

sanchit-saini commented 4 years ago

Hey @lawremi, I'm having trouble parsing and storing trackDb.txt. I'm using this file for reference: http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt along with the UCSC docs: https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html

How should I store this data? I think we need a tree-like data structure for it, but I'm not sure how I should implement this. Any pointers would be helpful. Also, can you please review the updated changes?

lawremi commented 4 years ago

A tree probably makes sense. You could define a class called "TrackContainer", and it could contain a List derivative (like "TrackList"), with the @elementType being the class union of "Track" and "TrackContainer".
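
A rough sketch of that idea in base R follows. To stay self-contained it uses a plain list slot with a validity check in place of a List derivative, so it only approximates the @elementType constraint; the slot names, the "TrackOrContainer" union name, and the track names are all hypothetical.

```r
library(methods)

# A leaf track; real tracks would carry their trackDb settings as well.
setClass("Track", representation(name = "character"))

# A container whose children may be tracks or further containers.
setClass("TrackContainer",
  representation(name = "character", children = "list"),
  validity = function(object) {
    ok <- vapply(object@children, is, logical(1), "TrackOrContainer")
    if (all(ok)) TRUE else "children must be Track or TrackContainer objects"
  }
)

# The class union plays the role of the element type.
setClassUnion("TrackOrContainer", c("Track", "TrackContainer"))

# Build a small tree: A contains B and C, and C contains D.
c_node <- new("TrackContainer", name = "C",
              children = list(new("Track", name = "D")))
a_node <- new("TrackContainer", name = "A",
              children = list(new("Track", name = "B"), c_node))

# Walk the tree depth-first and collect every node name.
track_names <- function(node) {
  if (is(node, "Track")) return(node@name)
  c(node@name, unlist(lapply(node@children, track_names)))
}
track_names(a_node)
```

Because the validity function checks every child against the union, constructing a container with anything else as a child fails at new() time.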

sanchit-saini commented 4 years ago

The trackDb tree is represented by the TrackContainer class. TrackContainer assumes that the trackDb file is indented with tabs (\t); otherwise it will not work properly.

TrackContainer(
    trackList = list(),
    childrenList = list(),
    parentLookUp = list()
)

For a tree in which A has children B and C, C has child D, and X has child Y, the parsed structure should look like this:

trackList[1]$A
trackList[2]$X

childrenList[["A"]][[1]]$B
childrenList[["A"]][[2]]$C
childrenList[["C"]][[1]]$D
childrenList[["X"]][[1]]$Y

parentLookUp[ "A/B", "C/D", "X/Y" ]

parentLookUp can be used to retrieve any nested child node: first search parentLookUp for the parent of the node; once the parent is known, simply search for the element inside childrenList[[parent]].
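
The lookup described above could be sketched roughly as follows. This is a simplified illustration, not the code in the pull request: children are stored as named lists rather than lists of one-element lists, the track values are placeholder strings, and find_child() is a hypothetical helper name.

```r
# Placeholder track values stand in for real track objects.
trackList <- list(A = "track A", X = "track X")
childrenList <- list(
  A = list(B = "track B", C = "track C"),
  C = list(D = "track D"),
  X = list(Y = "track Y")
)
# "A/C" is added here so that C can be looked up too (an assumption).
parentLookUp <- c("A/B", "A/C", "C/D", "X/Y")

# Find a nested child by name: locate its parent first, then index into
# that parent's children.
find_child <- function(name) {
  parts <- strsplit(parentLookUp, "/", fixed = TRUE)
  children <- vapply(parts, `[`, character(1), 2)
  hit <- match(name, children)
  if (is.na(hit)) return(NULL)  # not a nested node
  parent <- parts[[hit]][1]
  childrenList[[parent]][[name]]
}

find_child("D")
```

Top-level tracks such as A and X have no parent entry, so they would be looked up in trackList directly.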

I'll add comments to this part of the code to make it easier to maintain and to understand the logic in the future.
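
The tab-indentation assumption mentioned above could be turned into parent links along these lines. This is only a sketch under that assumption: the trackDb lines are made up, only the "track" lines are inspected, and the variable names are hypothetical rather than taken from the pull request.

```r
# Made-up trackDb content, one tab of indentation per nesting level.
trackdb_lines <- c(
  "track A",
  "shortLabel Track A",
  "",
  "\ttrack B",
  "",
  "\ttrack C",
  "",
  "\t\ttrack D",
  "",
  "track X",
  "",
  "\ttrack Y"
)

# Keep the stanza-opening "track <name>" lines only.
is_track <- grepl("^\t*track\\s+", trackdb_lines)
track_lines <- trackdb_lines[is_track]

# Depth is the number of leading tabs; name is what follows "track".
depth <- nchar(sub("^(\t*).*$", "\\1", track_lines))
name <- sub("^\t*track\\s+", "", track_lines)

# stack[k] holds the most recent track seen at depth k - 1, so the parent
# of a track at depth d is stack[d].
parent <- rep(NA_character_, length(name))
stack <- character(0)
for (i in seq_along(name)) {
  if (depth[i] > 0) parent[i] <- stack[depth[i]]
  stack[depth[i] + 1] <- name[i]
}

data.frame(name, depth, parent)
```

A file indented with spaces would yield depth 0 for every track here, which is the failure mode the tab assumption implies.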

sanchit-saini commented 4 years ago

@lawremi Can you please review the updated changes?

sanchit-saini commented 4 years ago

@lawremi Does this implementation look fine?

lawremi commented 4 years ago

Submitted a review. Please make sure to add tests ASAP.

sanchit-saini commented 4 years ago

Submitted a review. Please make sure to add tests ASAP.

Yes, I will. Currently, I'm working on getTrackDbContent(); once it's completed, I will push the code.

sanchit-saini commented 4 years ago

@lawremi Can you please review the updated changes?