Open RodenLuo opened 16 hours ago
It can be difficult to use Colab to run calculations without the user directly interacting with the Python notebook interface that Colab provides. In fact, the Colab terms of use say it is only for interactive use, presumably meaning interactive use through their Python notebook web browser interface, so running a calculation launched by a remote application such as ChimeraX is certainly a gray area. Because of this intent, there is no API for controlling Colab. So the ChimeraX AlphaFold tool launches a web browser within ChimeraX and starts a Python notebook that has a protein sequence entry field. It injects the protein sequence into the entry field using JavaScript and then simulates pushing the Run button with some more JavaScript.
I am not sure how you would upload a volume map to a Colab notebook. One idea is that ChimeraX writes the map to the user's Google Drive. There is a Python API to do that, although it requires authenticating to the user's Google account and/or maybe using an API key. This is a cumbersome process. Another idea is that the user uploads the map using whatever Colab's web browser interface offers for uploading files. I think this is also going to be cumbersome for users.
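For the manual upload route, here is a minimal sketch of what the notebook side could look like. It assumes the `google.colab.files.upload()` helper, which opens a file chooser in the notebook and returns a dict mapping file names to their bytes (I believe it also saves the files in the working directory). The `save_uploads` and `upload_maps` helpers and the `maps` directory name are hypothetical, not part of ChimeraX or DiffFit.

```python
from pathlib import Path


def save_uploads(uploaded, dest_dir="maps"):
    """Write a {filename: bytes} mapping to dest_dir and return the paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, data in uploaded.items():
        # Keep only the base name so an odd file name cannot escape dest_dir.
        path = dest / Path(name).name
        path.write_bytes(data)
        paths.append(path)
    return paths


def upload_maps(dest_dir="maps"):
    """Ask the user to pick files in the Colab UI (only works inside Colab)."""
    from google.colab import files  # available only in a Colab runtime
    return save_uploads(files.upload(), dest_dir)
```

Keeping the file writing separate from the Colab call means the same function can be tried outside Colab with an ordinary dict of bytes.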
Downloading a result (a zip file in the AlphaFold case) is easy. There is a Colab API to ask it to download a file and if the browser is within ChimeraX it gets the download request and directly receives the file and can write it to disk (for AlphaFold it gets written to ~/Downloads/ChimeraX/AlphaFold/prediction_N).
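As a rough sketch of the download step on the notebook side, assuming the `google.colab.files.download()` call and a results directory produced by the calculation: the `zip_results` and `download_results` helpers and the `difffit_results` archive name are made up for illustration, and this is not the code the AlphaFold notebook uses.

```python
import shutil
from pathlib import Path


def zip_results(result_dir, archive_stem="difffit_results"):
    """Pack the contents of result_dir into <archive_stem>.zip and return its path."""
    # archive_stem should live outside result_dir so the zip does not include itself.
    archive = shutil.make_archive(str(archive_stem), "zip", root_dir=str(result_dir))
    return Path(archive)


def download_results(result_dir, archive_stem="difffit_results"):
    """Zip the results and ask the browser to download them (Colab only)."""
    from google.colab import files  # available only in a Colab runtime
    zip_path = zip_results(result_dir, archive_stem)
    files.download(str(zip_path))
    return zip_path
```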
In addition to the difficulties of uploading files, my experience over the past 3 years maintaining the ChimeraX AlphaFold Colab capability is that it breaks several times per year. Half the time this is due to updates to the Colab environment, which never seem to be announced, e.g. updating from Python 3.7 to 3.9, to 3.10, to ..., or updating CUDA versions. So the maintenance of such a service is painful because you don't control the virtual machine environment.
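One thing that can make these silent updates easier to diagnose is to print a short environment report at the top of the notebook. The sketch below uses only the Python standard library. The `environment_report` name, the minimum Python version and the `torch` default in the list of required modules are placeholder assumptions, not what the AlphaFold notebook does.

```python
import importlib.util
import sys


def environment_report(min_python=(3, 9), required=("torch",)):
    """Summarise the runtime so a silent Colab update shows up early in the log."""
    return {
        # Full interpreter version, e.g. "3.10.12".
        "python": "%d.%d.%d" % sys.version_info[:3],
        # True if the interpreter is at least the version the notebook was tested with.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # Required top-level modules that cannot be found in this environment.
        "missing": [m for m in required if importlib.util.find_spec(m) is None],
    }


if __name__ == "__main__":
    print(environment_report())
```

This does not prevent breakage, but a report like this in the job log makes it quicker to tell an environment change from a bug in the notebook itself.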
A simpler idea might be to create a Colab notebook for running DiffFit that does not involve ChimeraX. It would present a form in a web browser to choose files to upload, then compute and download the results. The user can view those results by opening them in ChimeraX. The drawback is that if DiffFit is using a lot of ChimeraX Python capabilities, you won't have access to them in Colab. But that same problem will be present if you are making a ChimeraX Colab service. One way to handle that is to try the ChimeraX PyPI package, which your Colab notebook could install and use.
Hi Tom @tomgoddard,
I happened to see the AlphaFold prediction bundle in ChimeraX and am thinking of setting up a similar pipeline. Users without a CUDA GPU would then also be able to run DiffFit rapidly. Before doing so, I want to understand the feasibility and the best practice. It would be much appreciated if you could share some experiences here, especially for the following questions.
Many thanks!