pencil2d / pencil

Pencil2D is an easy, intuitive tool to make 2D hand-drawn animations. Pencil2D is open source and cross-platform.
http://pencil2d.org
GNU General Public License v2.0
1.47k stars 273 forks source link

Keyframe refactoring and xml format modification proposal #1306

Open scribblemaniac opened 4 years ago

scribblemaniac commented 4 years ago

I want to propose a restructuring to how keyframes are represented in the code and the project files. I would like to hear what everyone things about this idea, but first some background.

'Recently' I was reviewing #1292 and there was one issue with it that bothered me, and that issue is with how the comments are saved. The solution is functional, but there are some issues which I'd like to go over. I'm not picking on @davidlamhauge here or anything like that, it is just a perfect motivation for the changes I want to propose. The main.xml generated by that branch looks something like this right now:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PencilDocument>
<document>
  <projectdata>
    ...
  </projectdata>
  <object>
    ...
    <layer type="1" visibility="1" name="Bitmap Layer" id="3">
      <image topLeftX="0" topLeftY="0" src="003.001.png" frame="1"/>
    </layer>
  </object>
  <framecomments>
    <content slug="" dialogue="Hello world!" layer="2" frame="1" action=""/>
  </framecomments>
</document>

Now there are a couple things I think are wrong with this. First off, storing the strings as attributes is not ideal because they could theoretically be quite large and have have special characters which an element may be better suited for. So we could do something like this:

<framecomments>
  <content layer="2" frame="1">
    <dialogue>Hello world!</dialogue>
  </content>
</framecomments>

There's an attribute for the layer that these comments are for, but there is also that same information in the layer tag. It seems natural that the frame comments should be in the layer tag like so:

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <image topLeftX="0" topLeftY="0" src="003.001.png" />
  <framecomment frame="1">
    <dialogue>Hello world!</dialogue>
  </framecomment>
</layer>

However after all this, there is still redundant data in the form of the frame number. This is where my proposal comes into play. Right now a keyframe and it's contents are synonymous. A BitmapImage or a SoundClip is literally a KeyFrame type in the code and the only indicator that a keyframe exists in the save file is whether there is an image take at that position or not. It's clear from the frame comments, as well as when considering potential future improvements such as easing, time based properties (ex. opacity fading in/out), or masks where the keyframes may consist of more than simply their trivial content. So what if we instead treated them as generic containers? For the file format it might start looking something like this:

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <keyframe position="1">
    <image topLeftX="0" topLeftY="0" src="003.001.png" frame="1"/>
    <dialogue>Hello world!</dialogue>
  </keyframes>
</layer>

Now the redundant frame information is eliminated. In the code, the KeyFrame class might look something like this:

class KeyFrame
{
public:
    enum KeyframeType
    {
        BITMAP,
        VECTOR,
        MOVIE, // not supported yet
        SOUND,
        CAMERA,
        SLUG,
        DIALOGUE,
        ACTION
    };

    // Constructors, destructor and functions applicable to the keyframe as a whole here

    template <KeyframeContent T>
    T* getContent(KeyframeType type) {
        return static_cast<T>(mContents.find(type).value());
    }

    void setContent(KeyframeType type, KeyframeContent contents) {
        // Perhaps the parent layer could be called here to check for any restrictions (no sound on a bitmap layer, etc.)
        // Alternatively the Keyframe class could have subclasses for each layer type, but that sounds like it would bring back a lot of the test-and-cast code we currently have
        // Personally I think that it is layer logic and it makes sense for it to be handled there, although storing the parent layer could complicate some things as we would have to make sure that remains correct.
        mContents.insert(type, contents);
    }
private:
    QMap<KeyframeType, KeyframeContent*> mContents;

    // Any other necessary member variables for the above functions
};

Basically, a Keyframe would just hold a collection of objects that inherit from a new KeyframeContent type. The particular implementation above has a limitation of at most one of each KeyframeType per Keyframe container, but it doesn't necessarily have to be that way if it makes sense to allow multiple of some type on a single keyframe.

There are multiple advantages to this. First it would be more natural to parse the proposed xml structure above. Second it would make it easier to attach additional data to keyframes in the future. Just make a new KeyframeType entry and a sublass of KeyframeContent and you would be able to add that right away to keyframes and have them move, save, delete, etc. with the keyframe.

The main disadvantage is of course that this could take a fair amount of work to implement since keyframes are used in so many places. Fetching the KeyframeContents will also slow down some operations, although it should be minor as long as we aren't attaching a large number of different data objects to each keyframe. Changing the file format always has risks associated with it. These changes would be breaking changes to fine file format, so a version converter would need to be written into Pencil2D to open old project files. With a basic implementation there would not be backwards compatibility, but this could theoretically be achieved by adding the elements both inside and outside of the keyframe tags. It is worth noting that I believe file loading and saving needs to be rewritten anyway at some point to use Qt's xmlstreams so much of that work could be done at the same time as this.

I've probably missed a few points. I started to write this a while ago and just never got around to submitting it so I thought I would just polish it up to the best of my ability with what I can remember and put it out there to see what everyone's thoughts are on this.

MrStevns commented 4 years ago

Good proposal and I agree, there's not much structure to the current layout and as such you have to invent a new type and figure out where to place it, while if we made a more generic structure, then it would be easier to expand and improve in the future.

As for your xml, you're still repeating 'frame' and 'position', is that intentional?

I don't think it makes sense to make a keyframe type of each comment element, they're always used in the same context so they should be read as such.

it's true there could be problems with attributes, because of the way xml handles special characters. Taking that into consideration I think it's better to create a new level, something like this.

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <keyframe position="1">
    <image topLeftX="0" topLeftY="0" src="003.001.png"/>
    <comments> 
      <dialogue>Hello world!</dialogue>
      <slug>Hello slug</slug>
      <action>Hello action</action>
    </comments> 
  </keyframe>
</layer>

I'm not sure about the KeyframeContent vs Casting between classes, if it means there would be a performance penalty but the rest sounds good to me

davidlamhauge commented 4 years ago

Good proposal @scribblemaniac ! The reason I made it as I made it, was because it seemed the best way for me: 1) As far as I could see, it wouldn't call for a change in the file format, meaning that those who used Pencil2D could load and save files as usual, whether they used comments or not. 2) I'm not sure how many would use the feature, so the redundancy wouldn't be that big a problem, I thought.

That said, I can follow your thoughts and will support any changes to the better.

scribblemaniac commented 4 years ago

As for your xml, you're still repeating 'frame' and 'position', is that intentional?

If you're referring to the frame attribute in the image tag, that was unintentional and I've just fixed it.

I don't think it makes sense to make a keyframe type of each comment element, they're always used in the same context so they should be read as such.

Yeah I agree that makes more sense, they are all basically the same "type" of data as well, just storing user entered text, at least right now anyway.

I think it's better to create a new level, something like this.

Creating another level does not add any benefit for the matter of handling special characters, but it may or may not make sense depending on how we are storing the comments. For instance, if we a keyframe storing multiple generic comment objects which inherit from KeyframeContent and stores the type of comment and the comment text, this would make more sense:

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <keyframe position="1">
    <image topLeftX="0" topLeftY="0" src="003.001.png"/>
    <comment type="dialogue">Hello world!</comment>
    <comment type="slug">Hello slug</comment>
    <comment type="action">Hello action</comment>
  </keyframe>
</layer>

This might make more sense if we want to allow users to create their own comment types in the future.

@davidlamhauge To be clear, this issue wasn't meant to be a criticism of the way you implemented frame comments. Redundancy isn't a huge deal because we're regenerating the xml each time we save instead of trying to update it and the frames take up much more space than the xml ever will. Changing the file format is a fairly big deal, so avoiding breaking changes the way you did was the right thing to do. This is just looking forward.

MrStevns commented 4 years ago

Creating another level does not add any benefit for the matter of handling special characters, but it may or may not make sense depending on how we are storing the comments. For instance, if we a keyframe storing multiple generic comment objects which inherit from KeyframeContent and stores the type of comment and the comment text, this would make more sense:

Ah yes that makes more sense

davidlamhauge commented 2 years ago

I merged the uptodate master branch into #1292 yesterday, and was reminded about this refactoring proposal. Two years has passed. Is it time to make a decision? I can't add much to what is written, but it makes perfectly sense to make a structure that is more easily read, maintained and expanded. The problem is to implement it and to keep backwards compatibility. I'm not sure I'm very helpful in that area...

scribblemaniac commented 2 years ago

I'll offer to implement these changes, but we all need to come to an agreement on exactly what the new xml structure and related classes should look like. We should go through and find any other features requested or on the roadmap or that we could conceivably add in the future and make sure that whatever format we settle on is easily extensible for the features and won't need rewriting again down the line.

davidlamhauge commented 2 years ago

I like this implementation, and I guess this is the way we go:

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <keyframe position="1">
    <image topLeftX="0" topLeftY="0" src="003.001.png"/>
    <comment type="dialogue">Hello world!</comment>
    <comment type="slug">Hello slug</comment>
    <comment type="action">Hello action</comment>
  </keyframe>
</layer>

And when you say that "We should go through and find any other features requested or on the roadmap or that we could conceivably add in the future...", I rely on @Jose-Moreno to come forward with ideas, since he probably is the one who has tried most animation software, plus he's maintaining our roadmap #540. I will of course try to think on coming features as well, but it's a tough job. As the Danish writer-actor-artist-poet Storm P once said, "It's hard to make predictions - especially about the future..."

Jose-Moreno commented 2 years ago

@scribblemaniac These are some that I think could change the XML specification as a whole due to requiring attributes that would need to be read from the XML to describe what's happening.

Color Assignment Node or Attribute

Frame Marking Attribute

Resource Linking Attribute

Interpolation / Easing Attribute

Frame Instancing Nodes or Attributes

Ideally the visualized instanced images should all link to the same resources in the data folder. e.g I draw an 8 frame walk cycle and I would repeat that on different parts of the timeline creating 96 frames, but would only link to 8 raw graphics.

These features could use attributes that would indicate:

Sound Editing

Frame Visibility

If I think of anything else I'll update this comment.


I'll make an adjacent comment with ideas on how the xml could be formatted but I may take some time coming up with good solutions for each problem.

Jose-Moreno commented 2 years ago

I'll make an adjacent comment with ideas on how the xml could be formatted but I may take some time coming up with good solutions for each problem.

Frame Instancing Nodes or Attributes

For attributes I imagine something like this:

<image src="002.074.png" opacity="1" topLeftX="-322"  topLeftY="222" frame="74" instance="1" frameInstances="["96"," 144","320","540"]"/>

What I'm not sure is how to treat instance exposure. If we allow for the user to manipulate instances like any normal frame then, instancing only refers to the ability to assign which frames will use the same resource as the set of frames we mark as originals. That could help optimize scenes as well, and this could be preset behavior where if you CTRL+C /V you would duplicate, but if you CTRL + ALT + C / V you would instantiate instead.

For example If I have the walk cycle in two's

[E] Empty

[1]-[E]-[2]-[E]-[3]-[E]-[4]-[E]-[5]-[E]-[6]-[E]-[7]-[E]-[8]-[E]

We would only be able to copy the relative position of each "image" node (frame) which is how we seem to measure exposure anyway (by calculating the difference between frames position)

Now, instead of attributes, we could also use an "Instance" group node where selected frames would be nested, so we end up with virtual copies, but they all point to the same resource. I see this as redundant information that would quickly make the files unnecessarily larger, but I guess that it could be clearer.

<instance startFrame="96">
      <image opacity="1" topLeftX="-322" src="002.074.png" frame="74" topLeftY="222"/>
      <image opacity="1" topLeftX="-336" src="002.073.png" frame="73" topLeftY="208"/>
      <image opacity="1" topLeftX="-270" src="002.071.png" frame="71" topLeftY="232"/>
      <image opacity="1" topLeftX="-247" src="002.069.png" frame="69" topLeftY="245"/>
      <image opacity="1" topLeftX="-248" src="002.067.png" frame="67" topLeftY="244"/>
      <image opacity="1" topLeftX="-247" src="002.065.png" frame="65" topLeftY="245"/>
      <image opacity="1" topLeftX="-247" src="002.063.png" frame="63" topLeftY="245"/>
</instance>

<instance startFrame="192">
      <image opacity="1" topLeftX="-322" src="002.074.png" frame="74" topLeftY="222"/>
      <image opacity="1" topLeftX="-336" src="002.073.png" frame="73" topLeftY="208"/>
      <image opacity="1" topLeftX="-270" src="002.071.png" frame="71" topLeftY="232"/>
      <image opacity="1" topLeftX="-247" src="002.069.png" frame="69" topLeftY="245"/>
      <image opacity="1" topLeftX="-248" src="002.067.png" frame="67" topLeftY="244"/>
      <image opacity="1" topLeftX="-247" src="002.065.png" frame="65" topLeftY="245"/>
      <image opacity="1" topLeftX="-247" src="002.063.png" frame="63" topLeftY="245"/>
</instance>

# and so on ...

If I didn't care about the realities of XML and we really wanted to get the original image node down, I'd love to pass attributes a value that reference other nodes attributes value. I'm not sure we can, nevertheless here's the idea.

<imageInstance src="@frameSrc.src" opacity="@frameSrc.opacity" topLeftX="@frameSrc.topLeftX" topLeftY="@frameSrc.topLeftY" frame="96" frameSrc="74"/>
# frameSrc could get an <image> object instead
davidlamhauge commented 2 years ago

The only thing I can think of, that Jose hasn't mentioned, is the about the camera. As you've seen on Discord, I am hoping for users to test the 'new' camera, stage 2. Stage 3 in new camera development concerns implementing a multiplane camera, with depth of field. This requires the following data:

Camera: (following values can vary from frame to frame, thus shifting focus) Aperture Focal length Distance setting

Layer: (Each layer has a fixed distance throughout a scene. That's how I see it) Distance from camera. (Bitmap and Vector layers)

I think these informations will do it.

chchwy commented 2 years ago

This is a good proposal. However, in my opinion, we should keep the old structure as much as possible. Extending the old XML structure should fulfill our needs.

Rather than adding a new <keyframe> tag. We could just put the comment tags (or any new feature in the future) in the existing <image> tag:

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <image src="002.003.png" frame="3" topLeftX="-238" opacity="1" topLeftY="-132">
    <comment type="dialogue">Hello world!</comment>
    <comment type="slug">Hello slug</comment>
    <comment type="action">Hello action</comment>
  </image>
</layer>

There are 2 advantages:

  1. Less work (potentially less bugs). We could avoid the situation of dealing with two different XML structures and save the time of implementing a file structure converter.
  2. Compatibility. The old Pencil2D program can still load the projects generated from newer Pencil2D programs. The top structures are the same, old Pencil2D could simply ignore the <comment> tags and still work ok.
chchwy commented 2 years ago

And for the KeyframeContent design, it doesn't seem to make sense to me.

A map is a flexible structure for a use case like: a keyframe may have an arbitrary number of elements, some may exist and some may be missing. Is this kind of flexibility really what we want?

MrStevns commented 2 years ago
<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <image src="002.003.png" frame="3" topLeftX="-238" opacity="1" topLeftY="-132">
    <comment type="dialogue">Hello world!</comment>
    <comment type="slug">Hello slug</comment>
    <comment type="action">Hello action</comment>
  </image>
</layer>

We could do that but in that case I see no reason to talk about the format of the xml document because we're not changing it, we're simply adding to it. I'm personally fine with that, because it's as you say, a lot easier than migrating to a new format.

scribblemaniac commented 2 years ago

However, in my opinion, we should keep the old structure as much as possible. Extending the old XML structure should fulfill our needs.

I agree with the advantages you mentioned, but I disagree that your approach is the best approach right now. While it is no longer as important to migrate from Qt XML to QXmlStreamWriter/Reader since they implemented Qt XML in Qt 6, it's not under active development and it still seems to be discouraged so I feel it's still a good idea to migrate to XmlStreams. To do that, we have to rewrite basically all of the file loading and saving code interacting with the XML file anyway, so there's no better time consider changes to the XML format in my opinion. We have never promised forward-compatibility, so I'm not too concerned with breaking that as long as the latest version allows users to load older pclx files and and perhaps the option to save in the older format in the save dialog.

<layer type="1" visibility="1" name="Bitmap Layer" id="3">
  <image src="002.003.png" frame="3" topLeftX="-238" opacity="1" topLeftY="-132">
    <comment type="dialogue">Hello world!</comment>
    <comment type="slug">Hello slug</comment>
    <comment type="action">Hello action</comment>
  </image>
</layer>

To me the biggest problem with this is semantics and the implications this has on the code structure. Is a comment part of an image? There could be an argument to be made for that, but what if you are on an x sheet and you want to make a comment on a frame that does not have a keyframe? Will you have to create a new bitmap image just to save your comment? And is BitmapImage really the the appropriate place to be storing comments? I could see it making sense sometimes but not other times. If you want to use this XML format, but want to store the comments elsewhere, how would those comment objects be loaded with information from within the BitmapImage's xml load function?

This coupling of data with keyframes becomes even more problematic with some of the other data we may store in the future. Say we want to fade out a layer while is playing an animation. You set the opacity on the first and last frames, but it can't interpolate between them because all the keyframes in between also have opacity values. To me there is an important distinction here between things that are properties of keyframes, for which this format would be acceptable, and things which are properties of the layers at specific frames, which would be better suited to my proposed format.

A map is a flexible structure for a use case like: a keyframe may have an arbitrary number of elements, some may exist and some may be missing. Is this kind of flexibility really what we want?

This is the flexibility I intended with my proposal. While it's not strictly necessary right now, I do think it is better for future extensibility. I don't think we should be trying to make a single Keyframe object hold everything, or each new feature is going to bloat that class and its subclasses.

Looking through all the things Jose has posted, as well as the additional things I have come up with (which I will post shortly), I'm not convinced that my initial proposal is the best it can be yet. I'm going to keep tinkering with this for a while and see if I can come up with something better.