realXtend / tundra

realXtend Tundra SDK, a 3D virtual world application platform.
www.realxtend.org
Apache License 2.0
84 stars 70 forks source link

(De)serialize enum Attributes using the symbol names instead of (u)int values #767

Open Stinkfist0 opened 10 years ago

Stinkfist0 commented 10 years ago

Instead of

<attribute value="0" type="int" id="runMode" name="Run mode"/>

it would be great to be able to see (and write)

<attribute value="RunOnBoth" type="enum" id="runMode" name="Run mode"/>

instead. Backwards compatibility with int/uint enum would be naturally able to be retain..

antont commented 10 years ago

nice idea, though also somehow weird when value="RunOnBoth" type="int"

.. as usually in xml "something" is a string, not int.. but sure it's for enums, perhaps fine in what txml is.

antont commented 10 years ago

this seems to be how some folks serialize c# enums to xml, http://stackoverflow.com/questions/2306299/how-do-you-use-xmlserialize-for-enum-typed-properties-in-c .. there in dotnetland.

antont commented 10 years ago

and in c++

class gender_t
{
public:
  enum value {male, female};

  gender_t (value);
  operator value () const;

private:
  ///...
};

& XML (schema, not example data):

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="gender_t">
    <xs:restriction base="xs:string">
      <xs:enumeration value="male"/>
      <xs:enumeration value="female"/>
    </xs:restriction>
  </xs:simpleType>

.. from http://www.artima.com/cppsource/xml_data_binding.html

jonnenauha commented 10 years ago

For me it would be enough that the int is used but the enum value it notated. Maybe a generic attribute semantic that the component could add in the attribute somehow. If you edit the semantic in txml it would not do anything, it would not even be read, just written as a note.

<attribute semantic="RunOnBoth" value="0" type="int" id="runMode" name="Run mode"/>

cadaver commented 10 years ago

The suggestion to have value="RunOnBoth" type="enum" should be doable, with both read/write support. Must just keep in mind that there's no actual enum attribute type but it's still int under the hood.

jonnenauha commented 10 years ago

I think that is worse option as we cant then freely rename the enums, which we have been doing lately for them to make more sense and to match our coding conventions etc. Its a lot more work and kind of fake, as said there is no enum type. We would have to build all the enum machinery in IAttribute which is a bit overkill imo to just know what a int value actually means.

If the target is really to make txml more human readable there are other things we should do as well :P

jonnenauha commented 10 years ago

Also, if we would want to focus something in the txml format. It should be reducing the size, not add more to it. Our txml is hugely wasteful how its now structured and the files get quite big very fast.

The most significant one would be to omit attributes if they have the components default value. This would drop the size probably 50% or more :) Would make also a lot more readable as you could see what you actually have set/changed and where.

cadaver commented 10 years ago

The enum serialization should work with no added code into the components or attributes, in the same way as editing already works, as long as there exists a live component at the time of saving and loading.

To support enum renames, yes we'd need to save the int value as well (which increases the file size as you say), and make it authoritative, in which case hand-editing would not be possible unless the user also updated the int value.

But is the hand-editing of txmls actually an important usecase today?

Stinkfist0 commented 10 years ago

XML is verbose, there's no way around it. Maybe we would want to introduce a less verbose JSON scene format perhaps if the size of TXML begins to worry us.

Renames would not be a problem. First of all renames (typo or inconsistency with the coding convention) should not happen, or if it happens, it should be fixed very early I during the initial code review. I did some experimenting (did not commit, but I have the code somewhere) with this excellent little class @cadaver had implement a long time ago https://github.com/realXtend/tundra/blob/tundra2/src/Core/OgreRenderingModule/OgreMaterialAsset.cpp#L21 I made a template version of it, added case-insensitivity option etc. This could be used for the task in hand. When making the name-value mapping we could keep the possible old and deprecated symbol names in the map also.

In general hand-editing of TXML is not really important use case, but I have found it irritating couple times when I have to go to digging the source code in order to find f.ex. what light type 1 is.

jonnenauha commented 10 years ago

Well its verbose, but we can still do lots of things to reduce the size. Like said attributes that have the default value should not be written at all. My main irritation reading txml is finding stuff that has actually touched as 90% of the file is default values that I really dont care to even see.

This would not require any changes into reading txml as Components init default values anyway pre deserialization, those values would just not be written. This would also make txml loading probably quite a lot faster (relatively to what it is now, again reading 90% of bytes has no effect to the scene state, just re-setting default values).

This would also make the system more robust as if we change default values, old txml would be upgraded automatically if the attribute value is not modified. Today old txml:s load the old default values.

I don't think hand editing should be a concern here. If you edit something its going to Placeable.transform or AssetReferences with search and replace to change base url or something like that. End users (non devs) should not really ever need to open a txml with a text editor, at least from Meshmoons perspective :)

I'm down with the enums to names, but I still think you are underestimating how quick this is to do. I know its not going to affect all the IComponent implementations but probably the machinery in IComponent and IAttribute and possibly in each components need to declare the enum names (which is probably mostly already done for EC editor into AttributeMetadata). Just from my point of view this is not a trivial change that I can justify doing on Meshmoon time. This has no impact to end users, just us devs happy to read txml files ;) But the change is welcome if someone does it.

jonnenauha commented 10 years ago

And on the other hand reducing txml size has direct impact to Meshmoon operations like pushing smaller files to the cloud and starting scenes faster due to smaller download and deserialization etc. This is a genuine win for us and I would like to implement this at some point.

I've also been thinking if we could make some compressed version as we have zlib etc. now present. This would be cool option but core might not be the proper place to imaplement this. Once we have the bytes we can do whatever with them.

Stinkfist0 commented 10 years ago

"Slim TXML" (do not write attributes with default values etc.), which we have discussed earlier too, IIRC, sounds good. Could be made into its own issue. Off to making a new issue for another matter that I just recalled!