simphony / simphony-common

The native implementation of the Simphony cuds objects
BSD 2-Clause "Simplified" License
7 stars 5 forks source link

Provide methods to query the numbers of items in the cuds container #72

Closed itziakos closed 9 years ago

itziakos commented 9 years ago

For example:


mesh = Mesh()

# return the number of points
mesh.number_of_points()  

# return the number of edges
mesh.number_of_edges()  
nathanfranklin commented 9 years ago

:+1:

ahashibon commented 9 years ago

this is a very useful point, so far user has to iterate to get the size of a container. Though why not CUDS all the way, as in: number_of_objects = some_container["CUBA.SIZE"]
Its not about the interface, but rather about the future interoperability requirements, the interface could be as suggested above, but internally it may be a CUBA based.

itziakos commented 9 years ago

this is a very useful point, so far user has to iterate to get the size of a container. Though why not CUDS all the way, as in: number_of_objects = some_container["CUBA.SIZE"]

Its not about the interface, but rather about the future interoperability requirements, the interface could be as suggested above, but internally it may be a CUBA based.

I am not sure why this will add anything to interoperability. The size of length of container is a property of the container and thus there is not ambiguity. It is also not clear what CUBA based means in this context.

nathanfranklin commented 9 years ago

The size of length of container is a property of the container and thus there is not ambiguity.

I agree.

Its not about the interface, but rather about the future interoperability requirements, the interface could be as suggested above, but internally it may be a CUBA based.

I don't understand as well. By, "internally it may be CUBA based", do you mean the implementation of `number_of_points() can use CUDS-keyords? If that is the case, I don’t understand that because the implementation varies. For example, in our current implementation of the stand-alone Mesh and hdf5-file Mesh (FileMesh) could be implemented as follows:

def number_of_points(): # Mesh
  return len(_points)
def number_of_points():  # FileMesh
  return self._group.points.nrows
ahashibon commented 9 years ago

I am not talking about the implementation really, my comment was maybe too short.

  1. The interoperability is meant in particular for the future and for the outside high level interoperability among different tools and models, not only in SimPhoNy. The question is how can we make sure the information is readily available with out the specific choice of the API or the specific choice of how we call the "number of objects" variable, i.e., whether it is number_of_XXX or size_of_XXX etc. CUDS keywords may provide a standard way to specify this.
  2. There are a number of object types (mesh, with its edges, points, etc, and particles, electrons, etc). Each of these would need its own specific "number_of_XXX" method. It would be nicer to have one CUBA based method to do it. for example: number_of_objects = get (CUBA.SIZE, Mesh()) or number_of_objects = get (CUBA.SIZE, particle container). Though this may seem as a matter of taste, it is really intended to keep the intrinsic API small, while allowing maximum flexibility and interoperability, as third parties for example would only need to call a get method, while the CUDA keywords will be drawn from a standard database.
itziakos commented 9 years ago

I do not see why using keywords improves interoperability while using number_of_XXX does not. If I understand correctly the argument is against having multiple versions for the similar functionality (we only alter the last part). This I think is a valid point (for which I am proposing something below) but there are far better solutions to the problem than using a keywords based api. Please note that having multiple functions with specific behavior instead of a single point of entry with multiple arguments is not a matter of taste it is a necessity in order to write maintainable and testable api. Furthermore, the number of functions and classes in an API is not directly linked to the quality of an API (see for example the Qt api, which is considered one of the cleanest and easiest to use).

Looking at the current api one can notice that there are a group of similar operations that take place inside CUDS containers. We have add_xxx, update_xxx, delete_xxx. There groups of methods really look like they belong into independent objects.

Example

def CUDSItems(object):
      """" Is a generic container for low level cuds components implementing the cuds mapping api.
      """"
      def add(self, key, item)

      def update(self, key, item)

      def __getitem__(self, key):

      def __delitme__(self, key):

      def __iter__(self)

      def itersequence(self, keys):

      def __len__(self)

def Particles(object):
    """ The greatly simplified Particles container class
    """

    def __init__(self, name, data):
          self.name = name
          self.data = DataContainer(data)
          self.items = CUDSItems() or a specialized subclass
          self.bonds = CUDSItems() or a specialized subclass

The above object will allow a more pythonic api

particles = Particles()

bonds = particles.bonds

print len(bonds)
for bond in bonds:
     print bond
     print [particles.items[uid] for uid in bond.particles]

items = particles.items
for coorinate in coordinates:
     items.add(Particle(coordinates=coordinates)

This has actually been partially implemented in https://github.com/simphony/simphony-common/blob/master/simphony/io/h5_particles.py where we use the generic H5CUDSItems object but we convert the operations into the flattened api.

itziakos commented 9 years ago

The above api example (https://github.com/simphony/simphony-common/issues/72#issuecomment-76705040) while more pythonic it is a little more work to implement for the adapters. Because the CUDSItems containers will need to be proxy objects to the internal information stored in the

Another example from TVTK (typed python wrappers for VTK) where the DataSet object shares a large number of similarities with our CUDS containers is to avoid the get. This api still easy to understand and discover.


particles.number_of_particles()
particles.number_of_bonds()

This would be also simple to change since the api stays backwards compatible with our current version of CUDS.

We might even go further and make these as read only properties.


particles.number_of_particles
particles.number_of_bonds
itziakos commented 9 years ago

Closing the issue and moving discussion in the wiki