vezel-dev / novadrop

A developer toolkit for interacting with and modifying the TERA game client.
https://docs.vezel.dev/novadrop
BSD Zero Clause License

Problem with dismantling items after packing DC #35

Closed: justkeepquiet closed this issue 2 years ago

justkeepquiet commented 2 years ago

There is a problem with dismantling items in the inventory. The problem occurs after unpacking and then packing the DC (even without changing any XML files). https://www.youtube.com/watch?v=cTiMCcDT4XA

Also the problem is not present when using the repack command.

alexrp commented 2 years ago

Need to see if I even have anything to dismantle on my account...

alexrp commented 2 years ago

I just tried with a data center roundtripped through XML on patch 115 and I don't see the issue when dismantling e.g. this. Also tried a few Kaia's Fury items on my archer.

Can you share more details? Does it happen on other items or just those you showed in the video?

justkeepquiet commented 2 years ago

This problem occurs on all items, but it does not always appear: on several repackings the problem was not present, and then it appeared again (I tried it on the unpacked DC without changing the files).

alexrp commented 2 years ago

Can you try this diff:

diff --git a/src/tools/dc/Commands/PackCommand.cs b/src/tools/dc/Commands/PackCommand.cs
index 47c54c4..4a95c88 100644
--- a/src/tools/dc/Commands/PackCommand.cs
+++ b/src/tools/dc/Commands/PackCommand.cs
@@ -55,7 +55,11 @@ sealed class PackCommand : Command

                 await Parallel.ForEachAsync(
                     files,
-                    cancellationToken,
+                    new ParallelOptions
+                    {
+                        CancellationToken = cancellationToken,
+                        MaxDegreeOfParallelism = 1,
+                    },
                     async (file, cancellationToken) =>
                         await DataSheetLoader.LoadAsync(file, handler, dc.Root, cancellationToken));

Just to rule out the possibility of a concurrency issue?
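For reference, here is a minimal standalone sketch (not code from the repository; the file names are made up) of what the option above does: with MaxDegreeOfParallelism = 1, Parallel.ForEachAsync still honors the cancellation token but processes one item at a time, so any ordering races between files go away.

using System;
using System.Threading.Tasks;

static class SerialLoadDemo
{
    static async Task Main()
    {
        // Hypothetical file list, just for illustration.
        var files = new[] { "a.xml", "b.xml", "c.xml" };

        await Parallel.ForEachAsync(
            files,
            new ParallelOptions { MaxDegreeOfParallelism = 1 },
            async (file, ct) =>
            {
                // With a degree of parallelism of 1, these bodies run strictly
                // one after another, eliminating any concurrency between loads.
                Console.WriteLine($"Loading {file}");
                await Task.Delay(10, ct);
            });
    }
}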

justkeepquiet commented 2 years ago

And note: if I run TeraToolbox with the Instant-Everything module (which removes the dismantling animation), dismantling items works normally.

justkeepquiet commented 2 years ago

Can you try this diff:

I have repacked now with your patch. The problem does not occur now. But it did not always occur before either, so more tests are needed.

justkeepquiet commented 2 years ago

I will try packing other DCs.

justkeepquiet commented 2 years ago

Another DC (with changed XMLs) was also packed without this problem.

alexrp commented 2 years ago

Since it's a non-deterministic issue, I'll let this sit for a day or two and check back in. Hopefully, after working with the diff above for some time, we'll know whether it's a concurrency problem.

alexrp commented 2 years ago

@justkeepquiet Has the issue reappeared while using the diff above?

justkeepquiet commented 2 years ago

I have repackaged the DC many times, with many file changes. The problem never occurred, and no other problems were found.

alexrp commented 2 years ago

@justkeepquiet Just to confirm,

I have repackaged the DC many times, with many file changes.

This is with the diff above, right?

justkeepquiet commented 2 years ago

Yes.

alexrp commented 2 years ago

Would you mind trying without the diff for a day or two also just to verify? If it starts happening again without the diff, then I think we can very safely conclude that the concurrent loading of data sheets is the problem.

I just want to make sure we rule out random network weirdness (or similar) before I start chasing a non-deterministic bug that may or may not exist. 😉

justkeepquiet commented 2 years ago

Okay.

justkeepquiet commented 2 years ago

I packed now without the diff. The problem appeared again.

alexrp commented 2 years ago

Thanks. This at least gives me something to go on. Hopefully I can track this down.

alexrp commented 2 years ago

Good news: We've probably found the root cause of this issue.

When we load data sheets in parallel, we add them as children of the __root__ data center node in a non-deterministic order. We used to think this was fine because we sort them when serializing the final data tree anyway.

But it turns out that the string table containing node/attribute names has to be sorted in a very particular way. During serialization, we construct the name table by doing a pre-pass over the data tree and adding names as we go. Because of the way we name unpacked XML files, we accidentally satisfy the sorting requirement when data sheets are loaded serially from disk in the order returned by the file system (alphabetical for NTFS). But throw in parallelism and this breaks randomly.

Consider two data sheets like this:

<A>
    <B a="..." b="..." />
    ...
</A>
<A>
    <B c="..." />
    ...
</A>

The desired name table here would be:

A
B
a
b
c

But you can imagine that if we process the second file before the first file, we instead end up with a name table like this:

A
B
c
a
b

So, basically, we need to do something a bit more sophisticated to construct a correct name table. We need to build up a 'name tree' that describes all the possible node/attribute names. Each level of the name tree contains a node name and a list of attribute names. After construction, we sort all nodes by name, and all attributes by key status and name. Finally, we walk the name tree and add all names to the table. We then sort and serialize the data tree as usual and it should (hopefully) all be sorted correctly.
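To make the idea concrete, here is a simplified sketch of such a name tree in C#. This is not the actual novadrop implementation; the type and member names are made up, and 'key status' is reduced to a single boolean per attribute for illustration.

using System;
using System.Collections.Generic;
using System.Linq;

sealed class NameTreeNode
{
    // Child nodes, kept sorted by node name so traversal is deterministic.
    public SortedDictionary<string, NameTreeNode> Children { get; } =
        new(StringComparer.Ordinal);

    // Attribute name -> whether the attribute was ever seen as a key.
    public Dictionary<string, bool> Attributes { get; } = new();

    public NameTreeNode GetOrAddChild(string name)
    {
        if (!Children.TryGetValue(name, out var child))
            Children.Add(name, child = new NameTreeNode());

        return child;
    }

    public void RecordAttribute(string name, bool isKey)
    {
        // Once an attribute has been seen as a key anywhere, treat it as a key.
        Attributes[name] = Attributes.TryGetValue(name, out var wasKey)
            ? wasKey || isKey
            : isKey;
    }

    // Walk the tree: emit the node's name, then its attributes (keys first,
    // then by name), then recurse into children in sorted order.
    public void CollectNames(string name, List<string> table)
    {
        void Add(string value)
        {
            if (!table.Contains(value))
                table.Add(value);
        }

        Add(name);

        foreach (var attr in Attributes
            .OrderByDescending(a => a.Value)
            .ThenBy(a => a.Key, StringComparer.Ordinal))
            Add(attr.Key);

        foreach (var (childName, child) in Children)
            child.CollectNames(childName, table);
    }
}

For the two example sheets above, building one tree from both and then walking it produces A, B, a, b, c no matter which sheet happens to be processed first.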

alexrp commented 2 years ago

@justkeepquiet Please try again with a fresh build of commit 03f14a3ed01fe9de3d52f9dfbf498075a5a69d5d and without the diff from earlier. Hopefully this is fixed now.

justkeepquiet commented 2 years ago

I have now checked the latest revision. Unfortunately, the problem is still present.

alexrp commented 2 years ago

OK, I have another idea related to name sorting that might resolve it - will push some code tomorrow.

alexrp commented 2 years ago

@justkeepquiet Can you try latest master with this diff?

diff --git a/src/tools/dc/Commands/UnpackCommand.cs b/src/tools/dc/Commands/UnpackCommand.cs
index 6b50006..9f6797e 100644
--- a/src/tools/dc/Commands/UnpackCommand.cs
+++ b/src/tools/dc/Commands/UnpackCommand.cs
@@ -96,8 +96,7 @@ sealed class UnpackCommand : Command
                     Async = true,
                 };

-                // Official data center files have some empty sheets at the root that are safe to drop.
-                var realSheets = sheets.Where(n => n.HasAttributes || n.HasChildren || n.Value != null).ToArray();
+                var realSheets = sheets.ToArray();

                 await Parallel.ForEachAsync(
                     realSheets

You will need to do a fresh unpack (official DC) -> pack sequence, because this change preserves some empty data sheets that are normally dropped when unpacking.

I'm not sure if this will make any difference, but I guess it's worth trying.

justkeepquiet commented 2 years ago

I checked it this way. Unfortunately, the problem has not been solved.

alexrp commented 2 years ago

@justkeepquiet Please try with a clean master build now:

$ dotnet run -c Release unpack "/c/Program Files (x86)/Steam/steamapps/common/Tera/Client/S1Game/S1Data/DataCenter_Final_EUR.dat" DC-A
Unpacking 'C:/Program Files (x86)/Steam/steamapps/common/Tera/Client/S1Game/S1Data/DataCenter_Final_EUR.dat' to 'DC-A'...
Unpacked 24849 data sheets in 00:00:20.7808577.
$ dotnet run -c Release pack DC-A DC.dat
Packing 'DC-A' to 'DC.dat'...
Packed 24849 data sheets in 00:01:13.3399562.
$ dotnet run -c Release unpack DC.dat DC-B
Unpacking 'DC.dat' to 'DC-B'...
Unpacked 24849 data sheets in 00:00:20.2867095.
$ diff -r -u DC-A DC-B | wc -l
0

If it doesn't work with this fix, then I'll be at a complete loss. We're now producing basically identical data centers to the official ones (with the only difference being #21, but that should be irrelevant).

justkeepquiet commented 2 years ago

I tested on EUR DC 100.02 - it works fine. More tests may be needed, but the problem does not appear now.

alexrp commented 2 years ago

Great! Hopefully it's fixed for good, but if not, feel free to reopen the issue.