Feedback: Python API Design

BradyAJohnston commented 9 months ago

As of earlier this year, bpy can now be installed from pip and is officially supported and maintained by the Blender developers.

Because of this, molecularnodes can also now be installed via pip. While it is possible to do so, I have to emphasize that the API is definitely NOT STABLE, nor is it well documented at this point, but if you are curious and wish to try and poke around I welcome it.

When originally creating the add-on, I vaguely had in mind trying to create a nice python API so that a user could write scripts with MN. I have no experience however in doing so, and the requirement for it to also function as a blender add-on means that some things are slightly more complex than they need to be.

I would like to try and target having a stable API that is well documented, so that users can render via pure python scripts, for usage in jupyter notebooks or just general automated rendering powered by Blender & Molecular Nodes. All of this can also be combined to be an interactive jupyter notebook like in this example:

https://github.com/BradyAJohnston/MolecularNodes/assets/36021261/9527e46a-4cf5-4cd0-b9c2-514759bcb18c

The current structure of the package is as follows:

molecularnodes/
├── __init__.py
├── assembly
│   ├── __init__.py
│   ├── cif.py
│   ├── mesh.py
│   ├── mmtf.py
│   └── pdb.py
├── assets
├── auto_load.py
├── blender
│   ├── __init__.py
│   ├── bones.py
│   ├── coll.py
│   ├── nodes.py
│   └── obj.py
├── color.py
├── data.py
├── io
│   ├── __init__.py
│   ├── bcif.py
│   ├── cellpack.py
│   ├── density.py
│   ├── dna.py
│   ├── load.py
│   ├── local.py
│   ├── md.py
│   ├── mda.py
│   ├── pdb.py
│   └── star.py
├── pkg.py
├── ui
│   ├── __init__.py
│   ├── func.py
│   ├── node_info.py
│   ├── node_menu.py
│   ├── ops.py
│   ├── panel.py
│   └── pref.py
└── util
    ├── __init__.py
    └── utils.py

The ui can basically be ignored, as this is only used for the add-on.

I've tried to bundle all of the functions for interacting with Blender's API into blender, and all of the functions for parsing the different data formats into io, each in their own submodules.

To load a structure from the wwPDB:

import molecularnodes as mn
mol = mn.io.pdb.load('4ozs', style='cartoon')

This creates an object in the Blender scene, that has the structure from '4ozs', with a Geometry Nodes tree and the required nodes for taking that structure and generating a style, in this case 'cartoon'.

I think what is currently required is the creation of classes for each of the different objects. Then the user could use

mol.style('ball_and_stick')

to change the style that is currently being applied, or other similar operations.
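
A minimal sketch of what such a class might look like. The `Molecule` wrapper, the `STYLES` tuple and the `current_style` property are all hypothetical names for illustration, not the current MN API, and a real version would update the Geometry Nodes tree rather than just record a string.

```python
# Hypothetical sketch: `Molecule`, `STYLES` and `current_style` are
# illustrative names, not the existing molecularnodes API.
STYLES = ("cartoon", "ball_and_stick", "surface")


class Molecule:
    def __init__(self, code, style="cartoon"):
        self.code = code
        self._style = None
        self.style(style)

    def style(self, name):
        """Switch the style applied to this molecule and return self."""
        if name not in STYLES:
            raise ValueError(f"unknown style {name!r}, expected one of {STYLES}")
        # A real implementation would update the Geometry Nodes tree here.
        self._style = name
        return self

    @property
    def current_style(self):
        return self._style


mol = Molecule("4ozs", style="cartoon")
mol.style("ball_and_stick")
```

Returning `self` from `style()` is one possible design choice; it would allow calls to be chained, but is not required for the idea to work.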

A whole submodule for rendering will also need to be created, that will make controlling the camera, lighting and render settings much easier, but I am unsure what this should look like.

mn.render.frame_object(mol) # align the camera with this object?
mn.render.render_settings() # change the settings?
mn.render.render_image() # renders an image?
mn.render.render_video() # renders a video?

# should this instead be a class?
scene.render_settings(x=400, y=500)
scene.render_image()
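
One way the class-based variant could be sketched, so that settings travel with an instance instead of living in module-level state. `Scene`, `RenderSettings` and the returned dictionary are hypothetical stand-ins for illustration; nothing here calls Blender, and the defaults are arbitrary assumptions.

```python
from dataclasses import dataclass


@dataclass
class RenderSettings:
    # Hypothetical defaults, chosen only for the example.
    x: int = 1920
    y: int = 1080
    engine: str = "cycles"


class Scene:
    """Hypothetical wrapper that owns its render settings."""

    def __init__(self):
        self.settings = RenderSettings()

    def render_settings(self, **changes):
        # Reject unknown names so that typos fail loudly.
        for key, value in changes.items():
            if not hasattr(self.settings, key):
                raise AttributeError(f"unknown render setting: {key}")
            setattr(self.settings, key, value)
        return self

    def render_image(self, path="render.png"):
        # Stand-in: a real version would hand these values to Blender.
        return {
            "path": path,
            "size": (self.settings.x, self.settings.y),
            "engine": self.settings.engine,
        }


scene = Scene()
result = scene.render_settings(x=400, y=500).render_image()
```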

Open to any and all feedback on what it might look like; hopefully I can get some feedback from those more experienced with OOP and API design in general.

jojoelfe commented 9 months ago

I have been using the bpy module with MN in some of my pipelines (following the example notebook in the repo) for a bit now and find it definitely super useful! As such this effort is very much appreciated on my end.

I think a class approach would be nice for setting up multiple cameras, like so:

camera1 = mn.render.Camera(position=(100,100,100), dof=20, lens=15)
camera2 = mn.render.Camera(position=(300,200,10), dof=None, lens=35)

camera1.render_image(x=1024, y=1024, render="cycles")
camera2.render_image(x=2048, y=2048, render="eevee")

The crux of course is that in blender the resolution and rendering engine are set for the project and not for every camera. So you have to decide whether you want to hide this or to align your API with what blender does.
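
A rough sketch of the "hide it" option: each camera keeps its own preferences and pushes them onto the shared scene only for the duration of a render. `FakeScene` is a stand-in written for this example and is not a bpy type; passing the scene explicitly to `Camera` is also an assumption made to keep the sketch runnable without Blender.

```python
class FakeScene:
    """Stand-in for Blender's single scene; not a real bpy type."""

    def __init__(self):
        self.resolution = (1920, 1080)
        self.engine = "eevee"
        self.active_camera = None
        self.rendered = []  # record of what each render call saw


class Camera:
    """Hypothetical camera wrapper holding its own render preferences."""

    def __init__(self, scene, position, dof=None, lens=50):
        self.scene = scene
        self.position = position
        self.dof = dof
        self.lens = lens

    def render_image(self, x, y, render):
        scene = self.scene
        # Remember the shared scene state so it can be restored afterwards.
        previous = (scene.resolution, scene.engine, scene.active_camera)
        scene.resolution, scene.engine, scene.active_camera = (x, y), render, self
        try:
            # A real implementation would trigger Blender's render here.
            scene.rendered.append((self.position, scene.resolution, scene.engine))
        finally:
            scene.resolution, scene.engine, scene.active_camera = previous


scene = FakeScene()
camera1 = Camera(scene, position=(100, 100, 100), dof=20, lens=15)
camera2 = Camera(scene, position=(300, 200, 10), dof=None, lens=35)
camera1.render_image(x=1024, y=1024, render="cycles")
camera2.render_image(x=2048, y=2048, render="eevee")
```

Restoring the previous state in `finally` means a failed render would not leave the scene with one camera's settings applied, at the cost of the API no longer mirroring how Blender stores them.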

I also think that a nice pythonic API for controlling lights, camera, etc. would be useful outside of the context of molecular visualization, so you might think about making it a separate library which then is used in MN.

BradyAJohnston commented 9 months ago

Thanks for the feedback @jojoelfe, I'd be interested to see how you are currently implementing it.

I think I agree that an individual class for each camera makes sense if a user wants multiple cameras. Classes based around objects / cameras / lights seem like the way to go.

The bpy API is verbose and tricky at times, and is set up with GUI use in mind, so having everything streamlined would be the way to go. I think it is good to hide from the user that these settings are actually stored at the scene level rather than per camera; all of that can be handled behind the scenes for better ease of use.

I have thought about exactly what you are suggesting: a separate library that acts as a wrapper for bpy, enabling more streamlined interactions with Blender through scripts, would be a good idea. I have tried to keep all of my Blender-interacting code inside the blender submodule, but it is still all over the place and has lots of room for improvement and formalisation for usability.

BradyAJohnston commented 9 months ago

This of course then requires the most important step, which is coming up with a good package name. One other project that does a similar thing is EasyBPY.

jojoelfe commented 9 months ago

Here is what I am using currently:

https://github.com/jojoelfe/decolace/blob/main/src/decolace/processing/match_visualization.py

This renders 2D template matches on montages from cell slices. It depends on a feature I have in my fork of MN, that automatically imports the micrograph referenced in a starfile. I will make a PR for that shortly. I am just a bit cautious since this involves some "magic" with events inside of python.

The nice bit about this code is that it produces pngs with the same resolution as the loaded micrograph, which makes adding scale bars later quite straightforward. (Might actually be nice to make a "scalebar" node for MN now that I think about it)

BradyAJohnston commented 7 months ago

The 'first part' of this implementation I've done in #402. All importing is now done via classes internally.

The 3 main base classes are Molecule, Density and Ensemble. Trajectories imported via MDAnalysis haven't been altered yet, but they were already class-based.

I wouldn't say that it is all 'consistent'. There is lots of polishing to be done, but all previous capabilities are now implemented and for a general user there should be no change while opening and importing structures etc.

Opening files can be done through:

import molecularnodes as mn
import numpy as np

mol1 = mn.io.CIF('example.cif')  # to parse a `.cif` file
mol2 = mn.io.fetch('1cd3')       # to fetch a PDB code

# the returned 'Molecule' can access the associated 3D object
mol1.object
mol2.object

# the associated AtomArray that the data came from
mol1.array[0:10, :]

# convenience to get and set attributes from the 3D object
# beware that the number of 'atoms' in the 3D model may not correspond
# to the number of atoms in the AtomArrayStack

mol1.get_attribute('b_factor')
mol2.get_attribute('position')
mol2.set_attribute('b_factor', np.repeat(0, len(mol2)))

I am a little unsure about the usage of 'get_attribute' and 'set_attribute', as the data columns on the 3D model are called attributes, these might be confused with getting & setting attributes on the class / object itself. I have already been confused a bit myself while implementing it.

jojoelfe commented 6 months ago

I am a little unsure about the usage of 'get_attribute' and 'set_attribute', as the data columns on the 3D model are called attributes, these might be confused with getting & setting attributes on the class / object itself. I have already been confused a bit myself while implementing it.

Yes, that might be confusing. blender_attribute or named_attribute maybe? On a similar note, I sometimes run into trouble because the blender object gets stored in variables named object, which overwrites the builtin python object class.
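
For anyone who has not hit this before, a small plain-Python illustration of the shadowing problem. The function names are made up for the example, and the string simply stands in for a Blender object.

```python
def describe(value):
    # Here `object` still refers to the builtin class.
    return isinstance(value, object)


def broken_describe(value):
    # The local variable now shadows the builtin `object` in this scope.
    object = value
    try:
        return isinstance(value, object)
    except TypeError as err:
        # isinstance() needs a type as its second argument, not an instance.
        return f"TypeError: {err}"


ok = describe("cube")
failed = broken_describe("cube")
```

Note that an attribute such as `mol.object` does not shadow anything by itself; the clash only appears once the value is assigned to a bare variable called `object`.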

BradyAJohnston commented 6 months ago

I think that's maybe a good idea, to replicate the node names in Geometry Nodes and use

# option 1
mol.named_attribute_store('b_factor', np.repeat(0, len(mol)))
mol.named_attribute_get('b_factor')

# option 2
mol.store_named_attribute('b_factor', np.repeat(0, len(mol)))
mol.get_named_attribute('b_factor')

Could be a bit verbose, but I am always in favor of being more verbose with function names for the sake of clarity.
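
A small sketch of option 2, mainly to show how the longer names keep per-atom data separate from ordinary Python attributes on the class. The `Molecule` class here is a hypothetical stand-in that stores values in a dictionary rather than on a Blender mesh, and plain lists are used instead of numpy arrays to keep it dependency-free.

```python
class Molecule:
    """Hypothetical stand-in; a real one would write to the 3D object."""

    def __init__(self, n_atoms):
        self._n_atoms = n_atoms
        self._named = {}  # stand-in for per-atom attributes on the mesh

    def __len__(self):
        return self._n_atoms

    def store_named_attribute(self, name, values):
        values = list(values)
        # One value per atom is assumed for this sketch.
        if len(values) != len(self):
            raise ValueError(f"expected {len(self)} values, got {len(values)}")
        self._named[name] = values

    def get_named_attribute(self, name):
        # Return a copy so callers cannot mutate the stored data in place.
        return list(self._named[name])


mol = Molecule(n_atoms=4)
mol.store_named_attribute("b_factor", [0.0] * len(mol))
mol.label = "demo"  # an ordinary Python attribute, unrelated to the above
```

With this split, `getattr(mol, "label")` and `mol.get_named_attribute("b_factor")` clearly address two different things, which is the confusion the rename is meant to avoid.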

BradyAJohnston commented 6 months ago

On a similar note, I sometimes run into trouble because the blender object gets stored in variables named object, which overwrites the builtin python object class.

@jojoelfe I changed it to object over obj to avoid clashing with the mn.blender.obj module, but yes, I agree it's not ideal. Open to suggestions for changing either.

BradyAJohnston / MolecularNodes

Feedback: Python API Design #359