Open TorHou opened 9 years ago
This would be really great! Let us know what you need so we can get RNAcentral into Galaxy.
Great, thanks for the examples! I will look into this and post any updates here. Integrating RNAcentral and Galaxy will be very cool. Maybe we can do this for Rfam as well.
I am not sure about the timeline, but I should be able to get the ball rolling soon.
@AntonPetrov this made my day! Thanks!
@AntonPetrov anything we can help you here?
Thanks for checking in @bgruening! We've been busy working on the next RNAcentral release, so this is still on my todo list. I'll be sure to get in touch when I start working on RNAcentral/Galaxy integration.
@AntonPetrov awesome! Would be great to get it into the new version!
@AntonPetrov we are currently discussing further integration and cooperation between ELIXIR and de.NBI, and this project would be awesome. Do you see any chance to prioritize it? Should we organise a small hackathon?
Thanks a lot @AntonPetrov!
@bgruening this is a perfect time for this discussion as we just went live with RNAcentral 5 and are starting work on RNAcentral 6 (due in late June). I would be happy to include Galaxy integration in the list of release 6 features.
At this point the most useful thing (for me at least) would be to have some usage scenarios. I can think of several:
Is this the kind of thing you had in mind?
@AntonPetrov the first scenario is exactly what we have in mind. The second scenario is food for thought for the Galaxy Devs .. at least I don't think that Galaxy supports this, correct me if I'm wrong @bgruening
thanks @TorHou! The first scenario can work just fine, although there might be some limitations on the RNAcentral side.
For example, one can export no more than 250K sequences in FASTA/JSON formats because this is the maximum number of entries our Lucene instance is configured to paginate over. This could be resolved but it would require some engineering on our side.
One can do useful things even with 250K sequences, so this may not be a deal breaker, but I thought I'd let you know.
- a user is in Galaxy and wants to import lots of data from RNAcentral (for example, get all human ncRNAs like this: http://rnacentral.org/search?q=TAXONOMY:%229606%22).
This is what we need first and what is hopefully very easy to get running.
- conversely, a user is in RNAcentral and they just ran a sequence search (http://rnacentral.org/sequence-search/?q=URS0000049E57), and now they want to take the data into Galaxy and do a multiple sequence alignment.
Theoretically possible, but we need to figure out how to handle authentication and which Galaxy instance the user wants to submit the data to. So this is a little bit tricky. I could imagine an integration with usegalaxy.org as a proof of concept that requires a running Galaxy session or at least a user account.
I don't see a problem with the 250k limit yet. If we see a huge demand for this we can think about a solution at a later time point. 250k is quite a lot :)
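For reference, the search link in the first scenario is just a percent-encoded query string. Below is a minimal sketch of how a Galaxy-side tool might build it. The base URL and the 250K figure are taken from this thread; the function names are made up for illustration, and whether this endpoint returns anything machine-readable is an assumption that would need checking on the RNAcentral side.

```python
from urllib.parse import quote

# Base URL as it appears in the example link in this thread.
RNACENTRAL_SEARCH = "http://rnacentral.org/search"
# Export limit mentioned in this thread (FASTA/JSON exports).
EXPORT_CAP = 250_000


def taxonomy_search_url(taxid: int) -> str:
    """Build a search URL for all entries with the given NCBI taxonomy id.

    Hypothetical helper: it only reproduces the URL format shown above.
    """
    query = f'TAXONOMY:"{taxid}"'
    # Keep ':' unescaped so the result matches the link in the thread.
    return f"{RNACENTRAL_SEARCH}?q={quote(query, safe=':')}"


def fits_export_cap(hit_count: int) -> bool:
    """Return True if a result set is small enough to export in one go."""
    return hit_count <= EXPORT_CAP


if __name__ == "__main__":
    print(taxonomy_search_url(9606))
    # http://rnacentral.org/search?q=TAXONOMY:%229606%22
```

A tool could call `fits_export_cap` with the hit count before offering the import, and ask the user to narrow the query when the result set is too large.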
@AntonPetrov how do we want to proceed here?
At this stage I am happy to add this to the release 6 milestone and actually begin work in about a month's time. I don't think a hackathon is really necessary for this but I will be sure to get in touch if I have any questions.
Fantastic! Thanks a lot @AntonPetrov!
@AntonPetrov please this would be a great feature for both communities!
I am all for it @bgruening, thank you for the reminder! We are well behind our release schedule and this feature keeps being delayed. It is still on my radar though and I will get to it as soon as I can.
@AntonPetrov thanks for stopping by the Galaxy-GraphClust poster and for the interesting discussion. I think we now have some interesting use cases :) @bgruening suggested meeting sometime during ISMB, if you are available. That would be great!
Thanks @mmiladi! I would be very happy to meet with you and @bgruening. Any lunch break works for me or you can catch me at the EBI booth. What time would work for you?
@AntonPetrov we have the Galaxy BOF today during lunch. Maybe at the end of the lunch break or tomorrow?
@bgruening Sorry I missed you after the Galaxy BoF. Tuesday would work for me - or we could catch up during the poster session tonight or tomorrow (B-473/474).
Hi everybody, is there any progress on this? Would be awesome to have a working integration of RNAcentral in Galaxy
@jfallmann Hi Joerg, we'd like to see this integration ASAP ourselves. Currently the ball is in our court: we're in the process of acquiring a read-only, public-facing database instance that will contain RNAcentral data and be used by the Galaxy team to integrate RNAcentral. This instance will also be publicly available for accessing RNAcentral data directly.
It is taking the database administration department a while to address this, because this is the first time they're deploying this flavour of public database (and arranging the infrastructure necessary to maintain it). But we're in touch with them, heard from them recently, and we're reasonably optimistic about the timeframe: hopefully a couple of months on our side (fingers crossed, this is a guess, not a promise).
Thanks for your interest and sorry for delays.
Adding a comment just for the record. Our public postgres database is finally there. Here is a documentation page with the connection details and a few recipes:
https://rnacentral.org/help/public-database
Sorry for delays and hope this helps!
As discussed, it would be nice to have RNAcentral talk to Galaxy. :+1:
For direct communication between Galaxy and remote data sources, only a few things have to be added. On the front end not much needs to be done: a button should be shown when you identify that the incoming request is from a Galaxy server. Probably the best place to start is the Galaxy wiki on datasources
Then there is the small example that I mentioned, which is a CherryPy server communicating with Galaxy
And there is this repo by @erasche which shows another example of a server talking to Galaxy
Let us know (@bgruening or me) if you need assistance in any way
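To make the "button" idea a bit more concrete, here is a minimal sketch of the server-side logic, assuming the usual Galaxy data source convention: Galaxy sends the user to the remote site with a `GALAXY_URL` request parameter, and the remote site sends the user back to that address with a `URL` parameter pointing at the data Galaxy should fetch. The function name and the example addresses are hypothetical and not part of RNAcentral or Galaxy; please check the parameter names against the Galaxy data source documentation before relying on them.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def galaxy_return_url(request_params: Mapping[str, str],
                      export_url: str) -> Optional[str]:
    """Return the link for a "Send to Galaxy" button, or None.

    request_params: query parameters of the incoming request.
    export_url: address where Galaxy can download the selected data
                (hypothetical; any downloadable export link would do).
    """
    galaxy_url = request_params.get("GALAXY_URL")
    if not galaxy_url:
        # The request did not come from a Galaxy server: show no button.
        return None

    parts = urlsplit(galaxy_url)
    # Keep whatever parameters Galaxy already put on its callback address.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Assumed convention: Galaxy fetches the data from the "URL" parameter.
    query.append(("URL", export_url))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    incoming = {"GALAXY_URL": "https://galaxy.example.org/tool_runner?tool_id=rnacentral"}
    export = "http://rnacentral.org/search?q=TAXONOMY:%229606%22"
    print(galaxy_return_url(incoming, export))
```

The page template would render the button only when this function returns a link, so regular visitors who did not arrive from Galaxy see no change.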