open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org
Other
2.15k stars 859 forks source link

Unexpected behavior of DataType.getSize in Java #6366

Open malinkallen opened 5 years ago

malinkallen commented 5 years ago

I'm using Open MPI 3.1.3 (installed from a source code tarball).

According to the Javadoc of Datatype, getSize is the Java binding of MPI_TYPE_SIZE. However, getSize returns number of elements buffered, while MPI_TYPE_SIZEreturns the number of bytes occupied by the data type.

jsquyres commented 5 years ago

That appears to be exactly what the Java code is doing (returning the number of elements):

https://github.com/open-mpi/ompi/blob/ead2efb13683abd87fdbf1fd1775c4a71d55ca80/ompi/mpi/java/java/Datatype.java#L217-L228

Which Javadoc are you referring to?

The comment right above the function says:

    /**
     * Returns the total size of a datatype - the number of buffer
     * elements it represents.
     * <p>Java binding of the MPI operation {@code MPI_TYPE_SIZE}.
     * @return datatype size
     * @throws MPIException Signals that an MPI exception of some sort has occurred.
     */

...which seems to contradict itself:

  1. It says it returns the "total size of a datatype - the number of buffer elements it represents". This line is actually ambiguous: 😦
    1. Is the "-" just a point of punctuation, or is it a minus sign?
    2. What exactly is a "buffer element"? Let's assume it means "elements" in the MPI sense of the word, but still, it was an odd choice here.
  2. But then the 2nd line says "Java binding of the MPI operation MPI_TYPE_SIZE, which -- you're right -- returns the number of bytes (vs. the number of elements)

I'm inclined to go with the latter (i.e., it's supposed to be the Java binding for MPI_TYPE_SIZE), but I wanted to check if you were sourcing some other documentation.

kawashima-fj commented 5 years ago

Hmm.... Open MPI Java binding is based on mpiJava. mpiJava 1.2.5 already has the comment/code.

  /**
   * Returns the total size of a datatype - the number of buffer
   * elements it represents.
   * <p>
   * <table>
   * <tr><td><em> returns: </em></td><td> datatype size </tr>
   * </table>
   * <p>
   * Java binding of the MPI operation <tt>MPI_TYPE_SIZE</tt>.
   */

  public int Size() throws MPIException {
    if(baseType == OBJECT || baseType == UNDEFINED)
      return displacements.length;
    else 
      return size() / baseSize ;  
  }

According to http://www.hpjava.org/reports/mpiJava-spec/mpiJava-spec/node32.html , it counts elements (in the MPI sense) by design. I don't know the intent.

int Datatype.Size()

returns: datatype size

Returns the total size of the type. Java binding of the MPI operation MPI_TYPE_SIZE. Size is defined as the total number of buffer elements incorporated by the data type, or equivalently as the length of the displacement sequence. Unlike other language bindings, the size is not measured in bytes.

jsquyres commented 5 years ago

Are we missing Datatype.Size(), then? (i.e., that returns bytes, not number of elements)

kawashima-fj commented 5 years ago

Oh, I forgot to mention. Datatype.Size() in mpiJava is renamed to Datatype.getSize() in Open MPI Java bindings in e4e3e411fc52b7427f7b892fa40c2605224311d9. mpiJava and Open MPI Java bindings never has a method to return the number of bytes.

jsquyres commented 5 years ago

So what do you think we should do here? It would seem like a bad idea to change the behavior of an existing binding. It feels like we should

  1. Fix the docs to reflect that it returns number of elements / is not the binding to MPI_TYPE_GET_SIZE.
  2. Add a new method that is the binding to MPI_TYPE_GET_SIZE

How does that sound?

kawashima-fj commented 5 years ago

I'll look other methods to confirm it is the solution consistent with the entire API set.

ggouaillardet commented 5 years ago

FWIW, Siegmar Gross reported a related issue at https://www.mail-archive.com/users@lists.open-mpi.org/msg32525.html

Bottom line, I do not know the rationale for this, and I am not comfortable with it. For example, if we MPI_Type_create_resize(), the size will be in bytes and no more in elements. createResized() expects bytes and not elements, and as pointed by @kawashima-fj, there is no way to get the number of bytes of a given type ...

Java bindings are not part of the standard, and as far as I am concerned, I would not mind changing the current behavior from Open MPI 5 (it might break some codes, but it could also fix some other codes ...)

malinkallen commented 5 years ago

I downloaded openmpi-3.1.3.tar.gz from https://www.open-mpi.org/software/ompi/v3.1 Januari 23. After compiling the code, I found some documentation that I accessed via path-to-openmpi-installation/share/doc/openmpi/javadoc-openmpi/index.html. That's the documentation I'm referring to.

A binding to MPI_TYPE_GET_SIZE would be great, but I don't whether it's best to change the existing method or create a new one (and change the documentation of the existing one).

jsquyres commented 5 years ago

I assume the number of Java OMPI User’s is small, but I’m still quite hesitant about breaking existing apps - even if we think the behavior of the existing getData is wrong.

kawashima-fj commented 5 years ago

I looked at the Datatype class. The history is the following.

  1. In mpiJava, users don't need to count bytes, but need to count base types (char, int, float, ... in Java). For example, the array_of_displacements parameter of the Hindexed method is measured in base types, not bytes. Therefore Size and Extent methods count base types, not bytes. These are explicitly explained in its API Specification, and the design is consistent.
  2. When mpiJava was first integrated to Open MPI in 47c64ec8379cfea820709eae13dc07dbc7af8763, the design was not changed.
  3. Open MPI Java bindings were substantially changed in e4e3e411fc52b7427f7b892fa40c2605224311d9. By this change, users need to count bytes. For example, the array_of_displacements parameter of the createHindexed method was changed to count bytes. But getSize, getExtent, getLb, ... methods were not changed to count bytes. I think it is inconsistent.

@jsquyres The commit e4e3e411fc52b7427f7b892fa40c2605224311d9 is yours. Do you remember why you changed createHvector, createHindexed, createStruct, and createResized but didn't change getSize, getExtent, getLb, ...?

If we want to keep compatibility, I agree with @jsquyres's suggestion. However getSize is not the only problematic one. We need to add new methods corresponding to the following existing methods.

ggouaillardet commented 5 years ago

@kawashima-fj thanks for digging into this ! Now I understant the rationale of the initial design :-)

In C or Fortran bindings of MPI, derived datatypes have two roles. One is to allow messages to contain mixed types (for example they allow an integer count followed by a sequence of real numbers to be passed in a single message). The other is to allow noncontiguous data to be transmitted. In mpiJava the first role is abandoned. Any derived type can only include elements of a single basic type.

My interpretation is we do not follow the initial design any more. The introduction of the createStruct() method evidences my claim since we now allow a Java derived datatype to mix several datatypes.

/**
         * The most general type constructor.
         * <p>Java binding of the MPI operation {@code MPI_TYPE_CREATE_STRUCT}.
         * <p>The number of blocks is taken to be size of the {@code blockLengths}
         * argument. The second and third arguments, {@code displacements},
         * and {@code types}, should be the same size.
         * @param blockLengths  number of elements in each block
         * @param displacements byte displacement of each block
         * @param types         type of elements in each block
         * @return new datatype
         * @throws MPIException Signals that an MPI exception of some sort has occurred.
         */
        public static Datatype createStruct(int[] blockLengths,
                        int[] displacements, Datatype[] types)
                                        throws MPIException

I can see two ways of moving forward (both have the potential to break current codes) with a consistent design

ggouaillardet commented 5 years ago

@siegmargross your feedback is more than welcome !

jsquyres commented 5 years ago

@kawashima-fj The e4e3e411fc52b7427f7b892fa40c2605224311d9 commit reflects the Java redesign from Oscar Vega-Gisbert and Jose Roman -- I was just the committer (this was back in SVN days, and was before we gave Oscar direct SVN commit privileges). Meaning: this was the entire 2nd generation of the Java bindings, and reflected significant redesign work by Oscar and Jose.

If we have a handful of functions that were missed in this redesign, it seems like we have an inconsistent interface. My $0.02 is that we should probably document these functions as inconsistent (e.g., remove confusing text about "this is the Java binding for MPI_TYPE_SIZE" (because it isn't), and make new methods that are consistent with what we need.

kawashima-fj commented 5 years ago

@jsquyres Thanks for your explanation. I totally agree.

More information:

I found a paper on Java bindings in Open MPI. It seems to say getSize() and getExtent() count bytes. If so, it's a bug of the implementation.

From "3.7. Derived datatypes":

When working with derived datatypes, it is often handy to query about the size of the datatype (number of bytes of the data in a message that would be created with this datatype), with getSize(), and also about extent information, with getExtent() (and getLb() for the lower bound).

As @siegmargross said, getExtent and getLb methods count in elements but lb and extent parameters of createResized method counts in bytes. It's very strange.

Comm.allToAllw() and Comm.iAllToAllw() were introduced in 7548e342dd208dd0f9e4522e431d6a6dbe660f53 and their sDisps and rDispls parameters counts bytes.

ggouaillardet commented 5 years ago

I read the documents by Oscar and noted

All that being said, my conclusion is the intent of the Open MPI Java bindings is to "map" the C bindings, and we should hence always count in bytes and not in elements (a direct consequence is that the snippet in Figure 17 is wrong).

I understand that would be a major change compared to the current implementation, and it will break existing applications.

An other approach, which is more friendly to existing apps could be to "hack" the design, and states we always count in elements. If a derived datatype is made of more than one predefined datatypes, then the element is MPI_BYTE. We would have to change the definition of getExtent() and states it returns a number in elements (and not in bytes) - this is what the current implementation is doing by the way - and fix the example in Figure 10. We could introduce the new getBaseSize() method that returns the number of bytes of the element used to build the datatype (I suggest this name because it maps directly to what is implemented under the hood, and there is no equivalent C binding). We would also have to fix createResized() so the lb and extent parameters are in elements and not in bytes.

The other approach is IMHO wrong, but could be preferred until Open MPI 5 since it is less likely to break any existing app. From Open MPI 5, we could start doing the (IMHO) right thing and count in bytes everywhere the C bindings count in bytes - and that is equivalent to an ABI change that will break existing apps.

malinkallen commented 5 years ago

I'm not sure I fully understand this discussion, so I hope you don't mind a possibly stupid question. Let's assume that I define a data type in terms of MPI.DOUBLE:s

DataType type1 = Datatype.createHVector(count1, length1, stride1, MPI.DOUBLE);

and then define a new type in terms of type1:

DataType type2 = Datatype.createHVector(count2, length2, stride2, type1);

Right now, stride2 is the number of bytes between the start of each block, but what happens if you change it to the number of elements? Will stride2 then be the number of MPI.DOUBLEs between the start of each block, or the number of type1:s? I need to be able to specify the stride in terms of either bytes or MPI.DOUBLE:s because the stride of type2 is not always a multiple of the size of type1.

ggouaillardet commented 5 years ago

@malinkallen currently, stride is in bytes.

If we decide to use elements instead, then MPI.DOUBLE, type1 and type2 will have the same element which is MPI.DOUBLE. So if we decide to use elements, stride2 will be in MPI.DOUBLE.

jsquyres commented 5 years ago

Another option would be just go to "all in" on both approaches:

Have both APIs have common prefixes -- e.g., getSizeBytes() and getSizeElements(), or somesuch.

I.e., they'd both be "all new" sets of APIs, but:

  1. They'd compliment each other
  2. They'd be absolutely clear about what they do
  3. They wouldn't break old apps
  4. We could deprecate the old APIs and remove them over time
ggouaillardet commented 5 years ago

@jsquyres my understanding is we should always count in bytes (and we should have implemented that a long time ago when Java-like struct were introduced by Oscar and Jose) .

I understand your concern about not breaking current apps - or at least the ones that do not hardcode predefined datatype sizes because the current implementation does not provide any (straightforward) way to retrieve this info and some subroutines such as the Java binding for MPI_Type_create_resized() expects bytes and not elements.

That's why I'd rather temporarily add two new countInBytes() and countInElements() subroutines (default is to count in elements, and issue a warning the first time a binding that requires such a count is invoked), and then count in bytes from Open MPI 5 and remove at least countInElements() from Open MPI 6.

jsquyres commented 5 years ago

I'm confused by your proposal.

Why are you talking about count? The issue at hand has to do with size, i.e., getSize().

Are you proposing something like:

  1. In Open MPI v5:
    1. Change getSize() (and friends) to return bytes
    2. Add getSizeInBytes() and getSizeInElements() methods (and similar for other affected methods, as identified by @kawashima-fj)
  2. In Open MPI v6:
    1. Remove getSizeInElements() method (and similar)
  3. In Open MPI v7
    1. Remove getSizeInBytes() method (and similar)
ggouaillardet commented 5 years ago

What I proposed is to keep the existing APIs but have their behavior determined by a global flag. For example, getSize() returns size in bytes or elements depending on the global flag. Here is a possible roadmap

  1. In Open MPI 4, getSize() returns elements by default (e.g. no change) unless the user explicitly called countInBytes()
  2. In Open MPI 5, getSize() returns bytes by default (e.g. ABI change so to speak) unless the user explictly called countInElements()
  3. In Open MPI 6, getSize() always returns bytes, and we remove countInElements() (since end users had enough time to update their apps in Open MPI 5. Also remove countInBytes() since it became a no-op.
jsquyres commented 5 years ago

A global flag is not thread safe. Also, I think overloading the word "count" is dangerous.

What if we change getSize() (and friends) to take a boolean (e.g., returnValueInBytes), and it has a default value.

ggouaillardet commented 5 years ago

Fair enough, so let me elaborate/complement a bit more on my initial idea.

A first step could be to simply add a getBaseSize() method to the Datatype class (so end users can pass arguments in bytes to subroutines that require it such as createResized().

I see that as a band-aid since the current implementation/design is flawed (we should have moved to using bytes everywhere when struct like were introduced long time ago), and fixing that will break existing apps.

To help user transition (read keep their currently working apps work again unmodified with Open MPI 5), we could have a global flag. Such global flag can be set by a MCA parameter, or via an API call that can be invoked only once (before MPI_Init() ? ). The API could be setMetric(value) and value can either be ELEMENTS or BYTES. Or it could be setJavaBindingsSemantic(value) and value can either be MPJ or ompi

Per the discussion, I am now thinking a MCA parameter and no API is likely the best path to move forward.

Adding an extra argument would require some extra efforts to both developers and users now and later since the extra argument is already planned to be removed, that is why I think a new getBaseSize() method + a new MCA parameter is a better way to move forward.

This is my opinion and it is not a strong one.

gpaulsen commented 3 years ago

@ggouaillardet @jsquyres Is this much work to try to get fixed for v5.0.0?