caltechlibrary / cell-atlas

Cell atlas
https://cellstructureatlas.org

Protein Viewer #12

Closed KianBadie closed 3 years ago

KianBadie commented 3 years ago

Issue to discuss integration of a protein viewer into the Cell Atlas.

KianBadie commented 3 years ago

@coiko I decided to create an issue for this topic now to decrease the back and forth of email. You mentioned that the protein viewer will be accessed through the "Structure" field in the video citation sections for modals. When the protein viewer is opened, did you imagine it taking up the same amount of space that the videos/sliders do, and it being in the same position? Or did you imagine it kind of being like the enlarged sliders in modals, where a new window of its own takes over the modal? Or was it something else?

coiko commented 3 years ago

@KianBadie I like the idea of being similar to the enlarged slider in the modals. Thanks!

KianBadie commented 3 years ago

@coiko The demo pages are up on our testing site! The Mol* and PV demos are available at these URLs:

Like we talked about in our meeting, the viewers are opened by clicking the "Open In Viewer" text next to the pdb in the modal. There is currently no way to reshow the regular modal content once the viewer is opened, but I left it like that because I felt like feeling out the viewer functionality was the main purpose. If you resize the window, there might be some buggy characteristics as well.

As you probably already know, clicking the wrench in the Mol* viewer shows all the viewer settings (like coloring and model type). For the PV viewer, one model type and one color type may be selected together at a time. I included all the built in coloring schemes that I could currently find in PV. Unfortunately, there is no official list of properties to color by. However, I feel like I included the obvious ones (if not all the ones PV has available), and there are a few more properties I will try to check to see if PV will accept them as coloring schemes.

Please let me know if you have any questions or if anything prevents you from comparing the viewers properly!

Edit: I did some looking into the property names. For the residue coloring, there is an option called "num" in the demo. The property description for that is "Returns the numeric part of the residue number, ignoring insertion code." In addition, I feel that the "Is Hetatm" option might have been unclear in the demo as well. The description for that is "Returns true when the atom was imported from a HETATM record, false if not. This flag is only meaningful for structures imported from PDB files and will return false for other file formats." While the "HETATM" property sounds like it is purely derived from the biological side of things, I'm not sure about "num". Does that description sound familiar to you? If not, I can look further into what exactly that property means.

Also, I did not look too deeply for any other coloring properties today as I said I would; my apologies. From the documentation, it does seem like I covered all of them. It might be because I'm hoping there are more, but I want to see if there are any undocumented properties available by looking into the package more. Unfortunately, I think I will have to return to that next week.
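
For reference, the two property descriptions quoted above map onto fixed columns of a PDB-format ATOM/HETATM line: the residue number sits in columns 23-26, the insertion code in column 27, and the record name in columns 1-6. The sketch below is not PV code; `residue_info` and `pdb_line` are hypothetical helpers, and the sample residues are made up.

```python
from typing import NamedTuple


class ResidueInfo(NamedTuple):
    num: int         # numeric part of the residue number
    ins_code: str    # insertion code, "" when absent
    is_hetatm: bool  # True when the line is a HETATM record


def residue_info(line: str) -> ResidueInfo:
    """Read residue numbering from one fixed-column PDB ATOM/HETATM line."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM or HETATM record")
    return ResidueInfo(
        num=int(line[22:26]),          # columns 23-26: residue sequence number
        ins_code=line[26:27].strip(),  # column 27: insertion code
        is_hetatm=(record == "HETATM"),
    )


def pdb_line(record: str, res_seq: int, ins_code: str = "") -> str:
    """Build a made-up, truncated line with the standard column layout."""
    return f"{record:<6}{1:>5} {'CA':^4} {'ALA':>3} {'A'}{res_seq:>4}{ins_code:<1}"


# Residue "52A" has number 52 and insertion code "A"; "num" would report 52.
print(residue_info(pdb_line("ATOM", 52, "A")))
print(residue_info(pdb_line("HETATM", 301)))
```

So "num" would give the same value for residues 52 and 52A, and the HETATM flag only reflects which record type the atom came from.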

coiko commented 3 years ago

@KianBadie This is great - thanks! I'll play around with them in the next few days and let you know what I think.

KianBadie commented 3 years ago

@coiko To follow up on my message from last week, I am going to go ahead and assume that all the coloring options for PV are covered now. I looked a bit into the implementation of the PV package to see if maybe there were some properties to color by that weren't documented/obvious, however it is starting to feel unlikely that there is anything more than what I included. So I think it should be assumed for now that there are no further PV options and that what is in the demo is what will be available as the default.

coiko commented 3 years ago

Thanks @KianBadie! That sounds fine. Sorry for the radio silence about this. I've been playing with the demos and thinking about the best way to proceed. Mol* is awesome and powerful and has a prettier visualization style, but I think it's probably too complex for the student audience. There's just way too much information there to get lost in. So I think we should go with PV. If we do, how much customization can we do? Can we pick which options to show? Can we rename the options? Could we tweak the color schemes? Can we reverse the zoom direction (at least for me, the mouse scroll button gives the opposite of expected behavior from other sites)?

I'll give some more thought in the next couple days to the minimal set of options we should include.

KianBadie commented 3 years ago

@coiko No worries! I assumed you were taking time to weigh which viewer fit our needs better. PV it is! To answer your questions: We can pick which options to show. The menu that was included in the PV viewer is actually not built in. The upside is that we can style it however we want. The menu I included in the demo was just a quick copy from the demo on the PV site. For the same reason, we can also rename the options, because the menu is created by us. We can tweak the color schemes, as the coloring options have defaults but also take custom color arguments. I am not sure about reversing the zoom, but I hope that we can. That aspect also stood out to me while I was testing things. I will look into that!

Were there other customizations that you had in mind? If it helps, here is a list of demos from the PV site that show some of the customization. These demos also include the code to do them, so replicating them should be feasible. One cool demo that I imagine might be useful is Highlight atom under mouse cursor. I will do some research myself into other options we have to customize what the viewer can do.

coiko commented 3 years ago

@KianBadie Oh that's awesome! Yes, the hovering over an atom could be very useful. Thanks! I'll think about anything else that might be good to have.

KianBadie commented 3 years ago

@coiko Sounds good! I will go ahead and start experimenting with PV and see how we can tweak the colors and if it is possible to reverse the scrolling. I'm assuming the details of how PV will be integrated on our site, how it will look, and what options are available will be saved for a more detailed discussion in our Friday meeting?

coiko commented 3 years ago

Sounds great. Thanks @KianBadie!

KianBadie commented 3 years ago

@coiko Would you like me to keep the molstar demo on the site to check things out some more? Or are things set and it is safe for it to be removed?

Also, I wanted to give a heads up that I accidentally removed the ability to access the demo viewers when I merged some organization changes today. I will have that fixed soon!

coiko commented 3 years ago

@KianBadie No worries and thanks! And yes, I think it's completely safe to remove the Mol* demo.

KianBadie commented 3 years ago

@coiko Sounds great! The demos are now available again and I will go ahead and remove the Mol* demo.

It's a work in progress, but here are some simple ideas for what the menu could look like. I'll add more as I think of them, but I wanted to post these ones and see if any of them were in the right direction so that I can free up my hands to start some work on officially integrating the PV viewer into the site. Even though it's a very simple menu, it's admittedly tricky to make a mock of something that is completely satisfying! What do you think of the menus?

An important note is that the dropdown menus will look different on Mac since the mocks are using the default dropdown element. So on Mac, the dropdown will have the blue dropdown icon on the right side and be more pill shaped.

coiko commented 3 years ago

@KianBadie These are beautiful! I think the second option, with the transparent background, is perfect. Thanks so much!

KianBadie commented 3 years ago

@coiko I thought that one looked nice too. Glad they worked out!

KianBadie commented 3 years ago

@coiko The way I plan to implement things, I will assume that anything with a PDB structure ID in the video citation gets a viewer that opens that PDB. What happens in the scenario where there are multiple PDBs in a structure? Or when there is an EMD code?

coiko commented 3 years ago

@KianBadie good question! I think (at least for now) we should ignore EMD codes. They would require a different viewer and they're less fun for users to play with. As for multiple PDBs, I can't remember an example. I can only think of entries that have multiple EMDs, or a PDB and an EMD (e.g. 4.4:Spirosome), in which case we'd just show the viewer for the PDB. If you encounter one that has multiple PDBs, let me know and I'll figure out a workaround (I can combine them into a single file for display). Thanks!
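
A minimal sketch of the rule described above (ignore EMD codes, open the viewer for the PDB entry), assuming the "Structure" field is a free-text string such as "EMD-1234, 6UEW". The field format, the example IDs, and the `viewer_pdb_id` helper are assumptions for illustration, not the site's actual data model.

```python
import re
from typing import Optional

# A PDB ID is four characters: a digit followed by three letters or digits.
# The lookarounds reject matches glued to other text, such as the digits
# in "EMD-1234".
_PDB_ID = re.compile(r"(?<![\w-])[0-9][A-Za-z0-9]{3}(?![\w-])")


def viewer_pdb_id(structure_field: str) -> Optional[str]:
    """Return the first PDB ID in the field, or None if it holds only EMD codes."""
    match = _PDB_ID.search(structure_field)
    return match.group(0) if match else None


print(viewer_pdb_id("EMD-1234, 6UEW"))  # mixed entry: the PDB ID is picked
print(viewer_pdb_id("EMD-5678"))        # EMD only: no viewer
```

Returning only the first match mirrors the current assumption that no entry lists more than one PDB; an entry with several would need the combined-file workaround mentioned above.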

KianBadie commented 3 years ago

@coiko Sounds good! As for multiple PDBs, I think I was getting it confused with the mixed EMDs and PDBs. We currently don't have an example where there are multiple PDBs, like you said!

KianBadie commented 3 years ago

@coiko In our Friday meeting, we talked about how the structure in PV looked different than in Mol*. Would you be able to go over what the reasons could be? I tried downloading the PDB file and using that downloaded file as the source, but the differences were still there. One thing that I noticed is that when I switched the Mol* viewer structure type from "assembly" to "model", things started to look similar to the PV demo. For reference, here is our demo and the Mol* representations of PDB 6UEW.

Also, holding shift and using left click on PV pans the camera! That was a pleasant surprise. Maybe our "detect right-click" tooltip can advise to use shift to pan the camera.

coiko commented 3 years ago

@KianBadie Woohoo! That's awesome (the panning feature)!

And yes, the different appearance in PV and Mol* should be easily fixed. I think they're using different files from the PDB. The one we want (and that Mol* probably uses) is called "Biological Assembly 1". (Basically it has the whole functional complex, rather than just the smaller repetitive subunit, which is what's in the default PDB format file.) I'm attaching that file here - see if it does the trick. In many cases, the two files are the same, but there are a few special cases like this one, just to keep us on our toes. 6uew.pdb1.zip

KianBadie commented 3 years ago

@coiko Ah I see! I was able to get things working with that file. Will "Biological Assembly 1" be the file type used for all structures? And is there any difference between the ".pdb" and ".pdb1" file type?

coiko commented 3 years ago

@KianBadie Wonderful! Yes - they will all be that "Biological Assembly 1" download. There are 3 file formats (.pdb, .pdb1, and .cif) that are all opened by the same programs (.pdb1 just indicates that it's the assembly (i.e. there's more than 1 subunit) and .cif is used for larger files) so they should all work with PV, but let me know if they don't and I can convert. I sent you a zip with all the files, if that's easier. (Download link: here). Let me know if you find anything's missing or run into any trouble with anything. Thanks!

KianBadie commented 3 years ago

@coiko Sounds good! I will let you know if I run into any issues. So far I have not, and Tom said they should all work as well so I think we should be ok! And thank you for the zip, that will make things easier. I will let you know if anything else is needed!

Also, the development on the viewer has been thankfully smooth so far! I attached a gif of how things are looking so far. It is an exciting feature to work on!

Gif

viewer-demo

coiko commented 3 years ago

Oh how cool! Thanks @KianBadie!

KianBadie commented 3 years ago

@coiko It looks like PV is having some difficulty working with the cif files (the pdb/pdb1 files are working well though). I've been trying to fiddle with PV to see if there is something I am missing, but it is not looking too promising. There is built-in functionality for sdf files, but I'm not seeing anything for cif files. Unfortunately, it looks like there is not much discussion of this in the PV documentation or GitHub issues. You mentioned converting in your earlier comment. Is there a convenient way to convert while still displaying the same thing? Or does converting change the intended model to show? I'll check in with Tom as well. I'm not sure how extensively he knows PV, but I will see if this sounds familiar to him to double check if converting is necessary.

For my curiosity, are cif files formatted differently from a pdb/pdb1 file? Or is it just a bigger pdb file? Because if it is the latter, I'm not yet sure why it doesn't work.

Edit: I asked Tom about the conversions and he said that if PV does not recognize cif natively then it is best to convert it. He said if there were difficulties in converting then we can let him know as well!
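
On the format question: the two are laid out differently as text. PDB-format files (.pdb, .pdb1) are made of fixed-column records such as HEADER, ATOM and HETATM, while mmCIF files (.cif) start with a `data_` block and describe atoms through `_atom_site.` items inside `loop_` tables, so a PDB-only parser cannot read them. A rough sketch of telling them apart by content rather than by extension follows; `guess_structure_format` is a hypothetical helper and only checks the first meaningful line.

```python
def guess_structure_format(text: str) -> str:
    """Return "mmcif", "pdb", or "unknown" based on the first meaningful line."""
    pdb_records = ("HEADER", "TITLE", "REMARK", "CRYST1", "ATOM", "HETATM", "MODEL")
    for line in text.splitlines():
        # Skip blank lines and mmCIF comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        # mmCIF: data blocks, loop tables, and "_category.item" names.
        if line.startswith(("data_", "loop_", "_")):
            return "mmcif"
        # PDB: the record name occupies the first six columns.
        if line[:6].strip() in pdb_records:
            return "pdb"
        return "unknown"
    return "unknown"


print(guess_structure_format("data_6O9J\n#\n_entry.id 6O9J\n"))
print(guess_structure_format("HEADER    EXAMPLE\n"))
```

A check like this could flag a file that needs converting before it is handed to the viewer.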

coiko commented 3 years ago

@KianBadie Ah, sorry to hear that, but no worries. I'll try converting it tomorrow; I think it should be easy because as far as I know the base structure is the same.

coiko commented 3 years ago

@KianBadie Update: the file conversion seemed to work fine from what I can see, so I just sent you a new .zip with all the files as .pdb1 (download link also here). Let me know if anything doesn't work. Thanks!

KianBadie commented 3 years ago

@coiko When I get around to merging the protein viewers today, are you ok with the pv demo being removed along with it? Or would you like to keep it for some time?

coiko commented 3 years ago

@KianBadie Yes, I think it's great to remove the demo - thanks!

Also, here's the missing PDB file for 3JC8. Let me know if you find (or don't find) anything else! 3jc8.pdb1.zip

KianBadie commented 3 years ago

@coiko Sounds good! And thank you for the file!

There was actually an unfortunate problem with the cif converted pdb files. It seems that PV renders everything fine on my computer using the "secondary structure" model. However, when I try to change things to the "volume filling" model, it looks like my browser takes a performance hit and eventually gives up, giving an error message related to the rendering context (this would happen even after configuring PV to low quality settings and with no antialiasing). From some research, I'm guessing the pdb in question (6o9j) is too complex for my computer to render quickly, so the browser is killing the long process. When I try this with another cif converted pdb (5tcr), things worked ok on my desktop, but not on an iPhone 8. Safari on iPhone 8 would eventually display a message replacing the whole page saying "A problem repeatedly occurred on ", which makes me believe the performance theory more.

I got in touch with Tom, and he agreed that it sounds like a performance issue. He said it is probably too difficult to re-engineer PV to be more efficient, so he gave three possibilities. In his words in his reply email to me:

  1. Remove the sphere model as an option for the large proteins
  2. Make a modified pdb file that removes the atoms of the protein that you use for just the sphere model. Since the sphere model only shows the outer surface of the protein, you can often get very similar results using fewer atoms. For instance, you could include only the CA atoms in the pdb file. This will give you an approximate surface, but won’t work for the other visualization types. You could also strip out atoms that are at the center of each subunit, but there isn’t a great automated way to do this.
  3. Use Mol* for the large proteins.
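
As a rough illustration of option 2, the sketch below keeps only the ATOM records whose atom name (columns 13-16) is in a chosen set, CA by default. `filter_atoms` is a hypothetical helper, not something Tom or PV provides. It drops HETATM, ANISOU and CONECT records, leaves every other line untouched, and does not renumber atom serials, so the output should be checked in the viewer before use.

```python
def filter_atoms(pdb_text: str, atom_names: frozenset = frozenset({"CA"})) -> str:
    """Return PDB text that keeps only ATOM records with the given atom names."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record == "ATOM":
            # Columns 13-16 hold the atom name, e.g. " CA " for alpha carbons.
            if line[12:16].strip() in atom_names:
                kept.append(line)
        elif record in ("HETATM", "ANISOU", "CONECT"):
            # Records tied to atoms that are being removed are dropped too.
            continue
        else:
            # Header, TER, MODEL/ENDMDL, END and similar lines pass through.
            kept.append(line)
    return "\n".join(kept) + "\n"
```

Calling `filter_atoms(text)` gives a CA-only file; passing `frozenset({"N", "CA", "C", "O"})` instead would keep the protein backbone atoms.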

For option 2, I just told Tom that "ball and sticks" model was also giving problems, so I will see what he says about that option considering that (since he said that option won't work for other visualization types). I tried playing around with a couple of similar looking model types for pdb 6o9j in Mol*, and things seemed to render ok. Another thing that I wanted to look into that might be important is that the error message for the rendering context is detectable, so we could catch that error and do something in response. However, I am not sure if it alleviates the experience of the performance hit. On my computer, the current tab with the viewer will freeze for a bit, then the whole browser will go black for a second or 2, and then things will appear again with a blank canvas and the error message in the console.

My apologies for this roadblock coming right before we thought it was a wrap for the protein viewer 1.0. With all this in mind, would you still want a merge for today? Or wait until we have a solution for the large PDBs? My guess is the large PDBs might still be viewable on the default "secondary structure" model (depending on the computer). However, I would understand if it is better to hold off because of this issue.

coiko commented 3 years ago

@KianBadie Thanks for this great explanation, and I'm sorry you ran into that problem. I don't think it's worth waiting to merge while we figure this out, so go ahead if you'd like (but if you'd rather wait, that's totally fine, too!).

That particular structure is monstrously big. I'll give some more thought to the best way to handle large PDBs. We could limit the visualization style, or we could show just a piece of the whole structure, or we could just not enable the viewer for problematic cases. Anyway, I'll ruminate on it over the weekend and let you know any thoughts that (I hope) occur to me. Thanks!

KianBadie commented 3 years ago

@coiko No problem and no worries! I'm afraid I might have to take up your offer to wait, because there was actually another unfortunate problem. It looks like GitHub is having some complaints about the file sizes of the pdb files when I try to push them to our repository. I had to delete pdb 3j31 because it was over the file size limit, and I think 6kgx was the one causing the complaint after that. Things could work out if we put things on our main server and call the files from there. Tom said he could do that Monday, and I was thinking a possibility is to use the urls on the pdb website (since PV can fetch files remotely). However, even though it is a minor task to implement that change, I think I should just call it a day and save it for next week since I would prefer to not push changes to our testing site without sitting with them with a clear mind for a bit if the option to wait is there! In addition, I would still have to hide the modal when the viewer is open, which is minor but a similar situation to what I stated above. Again, my apologies for not having the viewer merged today after promising it in our meeting today. I will plan on having the protein viewer merged next week after letting things sit for a bit, hiding the modal, and changing things to use the pdb website urls or our server urls.

Ah I see! Yeah, I think those are all options worth considering. One other option I thought of today was maybe even putting a warning on problematic rendering options for the large files. Maybe a pop up message could show up on hover on the render options with a "render at your own risk of performance hit" kind of message? And no problem!

Lastly, as a follow-up to option 2 in my previous comment (where I was not sure how that would apply to "ball and sticks" rendering), Tom told me it would work (functionally) with balls and sticks, but you wouldn’t be able to see all the atoms since they are not in the file. If you want the balls and sticks representation on the large models, he recommends doing two views with two pdb files (one with the whole complex and one with just the area of interest).

Again, sorry for all the last minute changes in plans. Hopefully things can be merged in smoothly next week. Until then, hope you have a good weekend!

coiko commented 3 years ago

@KianBadie No worries at all! That sounds great, and it's definitely a good stopping point for the week. We'll figure out how to handle the files next week. In the meantime, have a great weekend!

KianBadie commented 3 years ago

@coiko The protein viewer is now up on our testing site! Again, my apologies for the delays and new problems we found along the way. One interesting thing to note is that the pdb files behave similarly to the video files in that access issues might arise when switching between urls (testing site/main site). This might not be a problem for you since the only place for you to access the viewer outside of building the site and serving locally is the testing site. But I just wanted to get that out there now.

On the topic of performance, I haven't reached any "aha" moments but I have had a new thought. I think one thing that adds to the trickiness of this issue is that I imagine it is also dependent on the device being used. I would have to do some more testing, but what if mobile devices have trouble rendering a certain pdb while desktop doesn't have a problem? If that is true for some cases, then it makes the issue a little more unpredictable.

Also, here is a gif of the iOS behavior I was talking about. I thought I would include since you have an android. There is a weird jump that happens when the color/model selector opens up. It might look laggy in the gif, but things are smooth on my actual iPhone (and look cool as well!). Other than that, I think I covered the important bits about iOS in our meetings/this gif: viewer-ios

coiko commented 3 years ago

Thanks @KianBadie Sorry for the slow response, but this is fantastic! It looks really great! I agree with your concern that the performance issues will likely vary by platform. I had an idea about how to shrink the large file sizes in a way that doesn't compromise the biological information too much, but it might take me another day or two to see if it will work. I'll keep you posted on whether that looks promising.

KianBadie commented 3 years ago

@coiko No worries on the response time, I know things like this take time to sit on. And sounds good! I will let you know if I run into any new ideas on this issue as well.

coiko commented 3 years ago

@KianBadie Thanks for your patience on the large-file issue! My idea didn't work out, and after more thought, I think the best solution is just to limit the view options based on size. For larger files, we can offer only "secondary structure" and "backbone" (I think those should be about the same complexity for rendering, but I might be wrong). Also, we may want to further limit options on mobile - maybe we should only show secondary structure for all of them?

Assuming that works, the other issue is the size of the files themselves. Will that be a problem on the main site, or just in the repository for testing? Pulling the files from the PDB runs into the file format problem, since most of those "Biological Assembly" files are .cif format. If we can get a good idea of what size becomes problematic, I can make .pdb files for the larger structures that only contain the coordinates of atoms in the backbone. In a quick test, that seems to cut the file size by about half. (But with that file, we can only display the backbone view.) What do you think?

KianBadie commented 3 years ago

@coiko No problem! And for the time being, I think that is a fine solution to this tricky problem. Maybe we can start with "secondary structure" and "backbone" and see if we still run into any issues. I think it wouldn't hurt to try both on mobile as well and see how things go. What are your thoughts on that? I guess the next thing would be to determine what qualifies as a large pdb. Maybe we could do some testing to find which pdb crosses the threshold. From a quick look at things now, the file sizes for the pdb files climb gradually from under 1MB to around 10MB until the last three big pdb files, which are 5u3c, 6kgx, and 3j31 (topping out at 200MB). So I can do some investigating to see where the cut off point lands. What are your thoughts on that?
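
A minimal sketch of the size-based limit discussed above, assuming the file size is known when the menu is built. The 20 MB cutoff is a placeholder until testing finds the real threshold, the mode labels are illustrative rather than the site's actual option names, and the mobile branch is only one possible reading of the "further limit options on mobile" idea.

```python
MB = 1024 * 1024

# Placeholder cutoff: somewhere between the ~10MB files and the largest ones.
LARGE_FILE_BYTES = 20 * MB

# Illustrative labels for the menu entries.
ALL_MODES = ("secondary structure", "backbone", "balls and sticks", "spheres")
LARGE_FILE_MODES = ("secondary structure", "backbone")
LARGE_FILE_MOBILE_MODES = ("secondary structure",)


def allowed_modes(file_size_bytes: int, is_mobile: bool = False,
                  large_threshold: int = LARGE_FILE_BYTES) -> tuple:
    """Return the render modes to offer for a structure file of this size."""
    if file_size_bytes < large_threshold:
        return ALL_MODES
    if is_mobile:
        return LARGE_FILE_MOBILE_MODES
    return LARGE_FILE_MODES


print(allowed_modes(5 * MB))
print(allowed_modes(200 * MB))
print(allowed_modes(200 * MB, is_mobile=True))
```

Keeping the threshold as a parameter makes it easy to adjust once the cutoff has been found by testing.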

My apologies, I never gave a direct update on the file size issue. On Monday, Tom uploaded all of the pdb files that you sent me to the same server that hosts the main site. And our testing site is getting the pdb files from that server in the same way that we get the videos. So the large file size is no longer a problem for the testing site and will also be accessible on the main site. One catch is that the same cross origin issues come up like the videos if one is flipping between sites.

coiko commented 3 years ago

That sounds great @KianBadie!

Thanks to you and Tom for setting up the files for the testing site. The structures weren't showing up for me this morning for some reason but they look great now (until I make it stick by trying to play with space-filling models of the biggest ones ;) Thanks!

KianBadie commented 3 years ago

@coiko Awesome! I will let you know any important findings as they come.

No problem! And please let me know if that problem happens again and I will investigate. It is possible that there could be a bug that prevents the files from being accessed in certain situations!

coiko commented 3 years ago

@KianBadie I played around a bit with a potential icon. What do you think about either of these (or something else)?

PVicon_box PVicon_nobox

coiko commented 3 years ago

@KianBadie And one more...with the box but horizontal as we discussed. Let me know how it works. Thanks!

PVicon_box_horizontal

KianBadie commented 3 years ago

I tested out the viewer icons. I think the horizontal ones are clearer than the vertical one. Seeing them in the context of the modal, how do you think they look? I'm wondering if the video citation element is too small for the image to be clear on what it is. Do you think the buttons could be bigger in the video citation element?

Screenshots

vertical horizontal 3d-horizontal

coiko commented 3 years ago

I think it looks great! Thanks @KianBadie! I agree that horizontal is better, and I like the boxed version (your last screenshot) best of all. I might just move it slightly farther to the right, maybe three times as far from the PDB link as it is now, so it's clearer that it's separate. We'll highlight it in an updated intro video for the site (along with the other awesome features you've added), so I think it will be clear what it does. Maybe at some point for super-accessibility we could also add a hover tip that would say something like "interactive viewer". Of course, the user can also just click it and find out!

KianBadie commented 3 years ago

@coiko For sure! I will go ahead and merge that small update into the site and we can further see how things look in action! I think the hover tip is also a great idea!

KianBadie commented 3 years ago

@coiko I just merged the icon into the site! To make the display of the video citation elements look nice with the protein viewer button, I had to make some small changes. A side effect of that is that the "Diversity" subsection modal has different line breaking in the video citation (things are more centered). From my memory, this is the only edge case, with all the others being the typical doi/structure dois, and I thought that those looked good. I figured everything still looked decent enough to throw it in today. If you think anything should be tweaked more or changed back please let me know!

New

new

Old

old

coiko commented 3 years ago

@KianBadie The icon looks fantastic! And this crazy list looks even better than before. Thanks!

KianBadie commented 3 years ago

@coiko I have run into some interesting findings on PV. Up until now, I have been using the latest release of PV (v1.8.1). However, I learned that there is a v1.9 that is not labeled as a release, but is available to download. One of the additions in this version was a change to the way the spheres model is rendered. Switching to this version did indeed allow sphere models to render more without crashing on desktop. Even on the largest file size pdb 3j31, my computer was able to render it with spheres (my computer, while a little old, still seems like it has more power than an average computer). Unfortunately, the Mac I tested things on was not able to render 3j31 at all (any model). That is our biggest pdb however (almost 2x bigger than the 2nd biggest if I remember correctly), so hopefully that is not the norm for other proteins on desktop. Besides spheres modeling on desktop, everything else rendered about the same.

If we do want to use this version, here are some other noticeable differences/additions:

There was another interesting finding. While I knew about the viewer configuration options to help with performance (quality and antialiasing), I overlooked that there are also config options that can be passed at the rendering type level. Spheres has a sphereDetail option. Balls and sticks has arcDetail and sphereDetail arguments. I am going to test how tweaking these works out and if they can help with increasing the range of devices that can support different modeling types.

Update: To follow up, I did some testing at the lowest settings. Visually, things do indeed look like a different quality. However, I was able to get balls and sticks modeling working for a pdb like 6o9j on desktop and a Pixel 3 phone. I attached a screenshot below of balls and sticks at very low settings.

While all of this doesn't provide any concrete or absolute solutions, I thought it might be useful to keep in mind as we continue to figure this issue out! In addition, there are config options to get something in between what we had and the very low settings I used. So I suppose these findings go to show that we can trade off visual detail in amounts that we choose in order to (hopefully) gain more device support.

Screenshot

bas
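
One way to picture the detail trade-off described above is a small preset table that produces the per-render-type options. This is only a sketch: the option names sphereDetail and arcDetail come from the comment above, but the mode labels, the preset names, and every number are assumptions, and the real values would have to come from testing on target devices.

```python
# Which detail options each render type accepts, per the comment above.
DETAIL_KEYS = {
    "spheres": ("sphereDetail",),
    "ballsAndSticks": ("sphereDetail", "arcDetail"),
}

# Hypothetical preset levels; lower means coarser geometry.
PRESETS = {"low": 2, "medium": 6, "high": 10}


def render_options(mode: str, preset: str) -> dict:
    """Build the options dict to pass along when rendering in the given mode."""
    level = PRESETS[preset]
    # Modes without detail options get an empty dict.
    return {key: level for key in DETAIL_KEYS.get(mode, ())}


print(render_options("spheres", "low"))
print(render_options("ballsAndSticks", "medium"))
```

A table like this would let the site choose a preset per device class or file size instead of hard-coding one quality level.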

KianBadie commented 3 years ago

@coiko I incorporated an initial version of the atom highlighting and have attached a gif of it below. What are your thoughts on it so far? Do you think any particular styling could enhance it? I defaulted to a red highlighting like PV's demo. However, on the rainbow coloring, the highlighting doesn't show on other red structures. Should the highlight be an outline maybe? I can see if PV can do that.

Gif

viewer

coiko commented 3 years ago

@KianBadie Thanks so much for looking into all this! The new version sounds like an improvement. I'm not sure it's worth keeping the ball-and-stick option for large files, though. It's cool that it works, but it looks pretty confusing to me at the lower resolution.

The residue highlighter looks fantastic! I love the styling! And I think your idea of an outline instead of a color would be perfect!