jefffhaynes / BinarySerializer

A declarative serialization framework for controlling formatting of data at the byte and bit level using field bindings, converters, and code.
MIT License

Chicken and Egg when working with EA-IFF 85 #12

Closed jda808 closed 9 years ago

jda808 commented 9 years ago

Having a hard time finalizing my implementation for some gnarly EA-IFF 85 form chunks. No matter how I structure or wire it up, I always end up having to implement IBinarySerializable. The stream position refuses to advance when I try to add padding via converters or properties. (See the attached chunk hierarchy diagram.)

There are 0 or more REFE chunks, and the AUTH chunk is optional.

Here's one (of about 10) different ways I've tried implementing REFE:

/// <summary>
/// There is a REFE chunk for every reference to XXX.
/// You can have 0 or more REFE chunks. Currently we work with
/// two versions of this chunk: the version wriXXXXXXXXX
/// </summary>

public class RefeChunk //: IffChunk //, IBinarySerializable
{
    [FieldOrder(0)]
    [FieldLength(4)]
    public string ChunkName { get; set; }

    [FieldOrder(1)]
    [SerializeAs(Endianness = Endianness.Big)]
    public UInt32 ChunkSize { get; set; }

    [FieldOrder(2)]
    [FieldLength("ChunkSize")]
    [SerializeAs(Encoding = "UTF-8", Endianness = Endianness.Big)] //, SerializedType = SerializedType.ByteArray)]                                                                                                              
    public RefePayload RefePayload { get; set; }

    //public byte[] RefePayload { get; set; }

    [Ignore]
    public int PadLength
    {
        get { return (ChunkSize % 2 == 0) ? 0 : 1; }
    }

    [FieldOrder(3)]
    //[SerializeAs(SerializedType = SerializedType.ByteArray)]
    //[FieldLength("PadLength")]    //, ConverterType = typeof(PadHackConverter)]
    //[SerializeAs(SerializedType = SerializedType.ByteArray)]
    [SerializeWhen("PadLength", 1)]
    //[FieldLength("PadLength")]
    public byte Padding { get; set; } = 0x00;

    //
    //public RefeChunk() : base("REFE")
    //{
    //}

    //public void Serialize(Stream stream, Endianness endianness, BinarySerializationContext serializationContext)
    //{
    //  throw new NotImplementedException();
    //}

    //public void Deserialize(Stream st, Endianness endianness, BinarySerializationContext serializationContext)
    //{
    //  BinarySerializer bs = new BinarySerializer();
    //  EndianAwareBinaryReader br = new EndianAwareBinaryReader(st, Endianness.Big);
    //  bs.Endianness = Endianness.Big;
    //  this.ChunkName = new string(br.ReadChars(4));
    //  this.ChunkSize = br.ReadUInt32();
    //  this.ChunkVersion = bs.Deserialize<WankyVersion>(st);
    //  Debug.IndentLevel = 0;
    //  Debug.WriteLine(this.ChunkName + " size:" + this.ChunkSize + " " + this.ChunkVersion);

    //  this.RPV3= bs.Deserialize<RPV3>(st);

    //  this.DBV3= bs.Deserialize<DBV4>(st);
    //  byte B13 = br.ReadByte();
    //  this.Reserved13 = B13;
    //  Debug.WriteLineIf(this.Reserved13 != 13, "!!!!!!!!!!! RESERVED BYTE 13 iS BAD !!!!!!!!!!!! (SLAPS WRIST)");
    //  if (this.ChunkVersion >= this.MaxVersion)
    //  {
    //      Debug.WriteLine("REFE Chunk version is >= 1,3,0 so we are going to parse some more stuff...");
    //      //OTHER STUFF blah blah
    //  }
    //  //if odd size, read an extra byte
    //  if (ChunkSize % 2 != 0)
    //  {
    //      byte stubbbby = br.ReadByte();
    //  }
    //  //db
    //  //a
    //}
}

Surely there is a way to avoid IBinarySerializable for such a simple concept, right?

jefffhaynes commented 9 years ago

Yes, I think so. Let me work on it.

jefffhaynes commented 9 years ago

I think the correct way to implement this is with a recursive object structure. However, currently the implementation doesn't support recursive objects. Let me work on fixing that and then I'll post a solution, thanks.

jefffhaynes commented 9 years ago

Ok, try something like this using the new 4.0 I just pushed.

public abstract class Chunk
{
}

public class FormChunk : Chunk
{
    [FieldOrder(0)]
    [FieldLength(4)]
    public string TypeId { get; set; }

    [FieldOrder(1)]
    public List<ChunkContainer> Chunks { get; set; }
}

public class CatChunk : Chunk
{
    public List<ChunkContainer> Chunks { get; set; } 
}

public class RefeChunk : Chunk
{
    [SerializeAs(SerializedType.SizedString)]
    public string SomeStuffInThisChunk { get; set; }
}

public class ChunkContainer
{
    [FieldOrder(0)]
    [FieldLength(4)]
    public string TypeId { get; set; }

    [FieldOrder(1)]
    [SerializeAs(Endianness = BinarySerialization.Endianness.Big)]
    public int ChunkLength { get; set; }

    [FieldOrder(2)]
    [FieldLength("ChunkLength")]
    [Subtype("TypeId", "FORM", typeof(FormChunk))]
    [Subtype("TypeId", "CAT ", typeof(CatChunk))]
    [Subtype("TypeId", "REFE", typeof(RefeChunk))]
    public Chunk Chunk { get; set; }

    [FieldOrder(3)]
    [SerializeWhen("ChunkLength", false, ConverterType = typeof(IsEvenConverter))]
    public byte Pad { get; set; }
}

public class IsEvenConverter : IValueConverter
{
    public object Convert(object value, object parameter, BinarySerializationContext context)
    {
        var intValue = System.Convert.ToInt32(value);
        return intValue%2 == 0;
    }

    public object ConvertBack(object value, object parameter, BinarySerializationContext context)
    {
        throw new NotImplementedException();
    }
}

And so on...let me know if it works or not, thanks

Code is here: https://github.com/jefffhaynes/BinarySerializer/tree/master/BinarySerializer.Test/Issues/Issue12
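For reference, a minimal usage sketch of the classes above (the file name is a placeholder; deserializing `ChunkContainer` as the entry point is confirmed later in this thread):

```csharp
using System.IO;
using BinarySerialization;

class Program
{
    static void Main()
    {
        var serializer = new BinarySerializer();

        // Deserialize the outermost container; the Subtype attributes on
        // ChunkContainer.Chunk select the concrete chunk class from TypeId.
        // "example.iff" is an illustrative file name, not from the repo.
        using (var stream = File.OpenRead("example.iff"))
        {
            var root = serializer.Deserialize<ChunkContainer>(stream);
            // root.Chunk will be a FormChunk, CatChunk, or RefeChunk
            // depending on the four-character TypeId read from the stream.
        }
    }
}
```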

jda808 commented 9 years ago

This is great! Feels like you read my mind. I

jefffhaynes commented 9 years ago

Awesome, glad to hear it. Not sure if your comment got truncated but did you get it to work?

jda808 commented 9 years ago

Checking now. And yes, I had a PEBKAC issue with the web browser.

jda808 commented 9 years ago

It looks like you fixed some initialization issues. I no longer have to explicitly initialize complex objects when they are deep in the graph. The recursive support is much appreciated.

Tried adjusting the example but can't get it to work. There's an additional chunk that sits between CAT and REFE, called REFS (see below). REFS has 0 or more REFEs. Either the padding gets off for DESC, or there are padding issues between the REFEs, or the last REFE is incomplete. I've tried tackling this with ItemSerializeUntil, SerializeWhen, and FieldLength... but can't seem to get it ironed out.

| Parameter | UNIT | Description |
|-----------|------|-------------|
| Form | char(4) | "FORM" |
| length | UInt32 | size |
| ChunkName | char(4) | "PTCH" |

The REFS chunk contains ZERO or more REFE chunks:

| Parameter | UNIT | Description |
|-----------|------|-------------|
| Cat | char(4) | "CAT " |
| length | UInt32 | Size |
| ChunkName | char(4) | "REFS" |

Then the REFE chunks: 0 or more...

| Parameter | UNIT | Description |
|-----------|------|-------------|
| ChunkName | char(4) | "REFE" |
| length | UInt32 | Size |
| ... | ... | ... |

Then a DESC chunk. It's always present.

| Parameter | UNIT | Description |
|-----------|------|-------------|
| ChunkName | char(4) | "DESC" |
| length | UInt32 | Size |
| stuff | stuff | stuff |

The AUTH chunk is optional. Sometimes it's there, sometimes it's not.

| Parameter | UNIT | Description |
|-----------|------|-------------|
| ChunkName | char(4) | "AUTH" |
| length | UInt32 | Size |
| ... | ... | ... |

There are a whole bunch of other chunks, but we need one more to complete the use case; let's call it "SNAX":

| Parameter | UNIT | Description |
|-----------|------|-------------|
| ChunkName | char(4) | "SNAX" |
| length | UInt32 | Size |
| ... | ... | ... |

jefffhaynes commented 9 years ago

I updated the example code at https://github.com/jefffhaynes/BinarySerializer/tree/master/BinarySerializer.Test/Issues/Issue12

It looks like I probably left a chunk name field out of the CAT chunk. However, just to be clear (assuming I'm reading the IFF spec correctly), there are Type IDs, and there are names (sometimes confusingly called types). Not all chunk types appear to have names, but all chunk types must have Type IDs (e.g. FORM, LIST, CAT , etc.). I'm assuming "REFS" is the name for a chunk with Type ID "CAT " in your case. So when you list, for example, "SNAX" as the chunk name, I don't think that's correct. SNAX would be the Type ID, not the name.

Essentially, it seems to me that every chunk starts with TypeID and Length (these fields show up in the ChunkContainer class in my code). Then within a specific chunk type (only FORM and CAT as far as I can tell) there may be a Name field specified. Again, that doesn't appear to be true for all chunk types.

As for padding, I believe the Pad field in the ChunkContainer class will always correctly adjust for odd padding irrespective of the chunk definition as it is bound to the local chunk length via a converter.

The real "magic" in the code comes from the Subtype attributes, which tell the serializer which chunk type to serialize or deserialize based on the TypeID field. If you need to add additional chunk types, just declare them and add a corresponding Subtype attribute in the ChunkContainer class.
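For instance, adding the DESC chunk described above might look something like this (the payload field inside DescChunk is purely illustrative; the Subtype registration pattern is the point):

```csharp
using BinarySerialization;

// Hypothetical DESC chunk definition; the Description field is a
// placeholder, since the actual DESC layout isn't given in this thread.
public class DescChunk : Chunk
{
    [SerializeAs(SerializedType.SizedString)]
    public string Description { get; set; }
}

// Then register the new TypeId in ChunkContainer alongside the others:
//
//   [Subtype("TypeId", "FORM", typeof(FormChunk))]
//   [Subtype("TypeId", "CAT ", typeof(CatChunk))]
//   [Subtype("TypeId", "REFE", typeof(RefeChunk))]
//   [Subtype("TypeId", "DESC", typeof(DescChunk))]
//   public Chunk Chunk { get; set; }
```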

Do you have example files you're using to test this? If so, it would be nice if I could commit one into the unit test.

jda808 commented 9 years ago

I'll gladly send you some example data. Do you want my existing unit test as well? Should I push it through git, or can I just email you a zip file?


jefffhaynes commented 9 years ago

You can email it if you want.

jda808 commented 9 years ago

The padding works brilliantly, but it looks like there are some lazy-load issues going on...

        //This doesn't work. Objects deep in the graph are never assigned a reference
        FormChunk form = bs.Deserialize<FormChunk>(st);

        //This works up to a point, but fails at the BODY chunk
        br.ReadChars(4);
        br.ReadUInt32();
        br.ReadChars(4);
        br.ReadChars(4);
        br.ReadUInt32();
        br.ReadChars(4);
        var refs = bs.Deserialize<ChunkContainer>(st);
        var desc = bs.Deserialize<ChunkContainer>(st);
        var parm = bs.Deserialize<ChunkContainer>(st);
        var body = bs.Deserialize<ChunkContainer>(st); // <-- The BODY chunk has a nested class containing a List with a SerializeWhen attribute that uses a converter.

        //This Does work though
        br.ReadChars(4);
        br.ReadUInt32();
        br.ReadChars(4);
        br.ReadChars(4);
        br.ReadUInt32();
        br.ReadChars(4);
        var refs = bs.Deserialize<ChunkContainer>(st);
        var desc = bs.Deserialize<ChunkContainer>(st);
        var parm = bs.Deserialize<ChunkContainer>(st);
        br.ReadChars(4);
        br.ReadUInt32();
        var bodyChunk = bs.Deserialize<BodyChunk>(st);
jefffhaynes commented 9 years ago

Yeah, I'm looking at it now. You actually need to Deserialize<ChunkContainer>, not <FormChunk> at the top. However, there is still something going on further down...

jefffhaynes commented 9 years ago

I just got it working. I think there were just missing chunks (BODY and PARM). The details of the chunks are missing but 1Z1S deserializes. Take a look...

2Z2S also works

jefffhaynes commented 9 years ago

Did you get this to work?

jda808 commented 9 years ago

Are the builds on NuGet and in your GitHub repository the same? If I update my clone using git fetch, the assembly version shows 4.0.1. The version on the NuGet repo says 4.0.2.

If they are, then no, not yet. In the BODY chunk of this implementation, there are a few dynamic classes with a byte that is only read/written if the GroupVersion is 3,0,0 or greater. The following test case uses a GroupVersion of 2,0,0. We utilize a SerializeWhen attribute against the GroupMono byte... I'll email you the spec and updated code.
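A sketch of how that gate might be wired, following the IsEvenConverter pattern from earlier in the thread (the converter name and the comparison against System.Version are my assumptions; the actual GroupVersion type comes from the emailed spec):

```csharp
using System;
using BinarySerialization;

// Hypothetical converter: returns true when the bound version is >= 3.0.0.
public class IsVersion300OrGreaterConverter : IValueConverter
{
    public object Convert(object value, object parameter, BinarySerializationContext context)
    {
        // Assumes GroupVersion is convertible to System.Version; the real
        // type in the spec may differ.
        var version = (Version)value;
        return version >= new Version(3, 0, 0);
    }

    public object ConvertBack(object value, object parameter, BinarySerializationContext context)
    {
        throw new NotImplementedException();
    }
}

// Usage on the conditional field (field order shown as a placeholder):
//
//   [FieldOrder(/* n */ 5)]
//   [SerializeWhen("GroupVersion", true, ConverterType = typeof(IsVersion300OrGreaterConverter))]
//   public byte GroupMono { get; set; }
```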

Test Name: VerifyBodyChunk
Test Outcome: Failed
Result Message: Test method XXX.Tests.XXXX.VerifyBodyChunk threw exception: System.InvalidOperationException: Error deserializing . ---> System.InvalidOperationException: Error deserializing . ---> System.InvalidOperationException: Error deserializing Chunk. ---> System.InvalidOperationException: Error deserializing Groups. ---> System.InvalidOperationException: Error deserializing . ---> System.InvalidOperationException: Error deserializing GroupMono. ---> System.NullReferenceException: Object reference not set to an instance of an object.
Result StandardOutput:
TypeId: BODY ChunkLength: 527 ReservedHead: 188 Major: 1 Minor: 0 Revision: 0 PostReserved: 0 ChunkVersion: Version(1,0,0) ReservedHead: 188 Major: 2 Minor: 0 Revision: 0 PostReserved: 0 GroupVersion: Version(2,0,0) GroupCount: 1 KeyPolyphony: 1 KeyMode: Legato

FWIW, I saw the 4.0.2 update via NuGet. I wasn't paying attention to what I was doing and updated the NuGet extension for VS 2015. Now when I go to update the package, the install fails. Apparently the NuGet extension update is not compatible with CTP6. It appears I have to reinstall VS 2015 to restore NuGet functionality.

jefffhaynes commented 9 years ago

No, I noticed this morning that I failed to commit 4.0.2 to GitHub. I'll fix it tonight.

jda808 commented 9 years ago

Just tried out 4.0.2. No luck. Additionally, the workarounds I was using to force initialization on earlier builds no longer work.

jefffhaynes commented 9 years ago

GitHub is updated now (to 4.0.3, which fixes a bug with Ignore fields), but I haven't had a chance to look at the files you sent me yet.

jefffhaynes commented 9 years ago

Based on our conversations I think this is all set now but correct me if I'm wrong.