Open supervitis opened 7 years ago
Thanks for reporting this @supervitis, and providing an example. Something definitely looks wrong. If I turn off ordering during diffing and patching, things work, but with ordering on (the default), it fails horribly. Will investigate.
Sure, no problem. Actually, if turning ordering off helps with the patching, I will try it with more of the files I am using right now and see what happens.
I am using the tool extensively as part of my Master's thesis, a study of semi-structured data evolution with a focus on CSV. I hope this is OK with you; I am referencing your work properly. Also, if I discover anything else I will let you know.
Using daff as part of a thesis is totally cool (makes me v happy), it is free/open source in any case.
What could be useful is paring this down to the minimum pair of csv files that tickle the problem, by removing parts bit by bit until the problem goes away, then putting that last bit in again. That'll be the first thing I do once I get time to delve into this.
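The paring-down idea above can be sketched in a few lines of Java. This is only an illustration, not part of daff: the class name `CsvMinimizer` and the `stillFails` predicate are hypothetical, and in practice the predicate would wrap whatever diff-and-patch round trip shows the bug. It also shrinks a single file at a time, so for a pair of CSV files it would be run once per file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper (not part of daff): shrinks a CSV, given as a list of
// raw lines, while a caller-supplied check keeps reporting the failure.
class CsvMinimizer {
    /**
     * Greedily removes one row at a time. A removal is kept only if the
     * failure still reproduces; otherwise the row is put back.
     * Row 0 is assumed to be the header and is never removed.
     */
    static List<String> minimize(List<String> rows, Predicate<List<String>> stillFails) {
        List<String> current = new ArrayList<>(rows);
        boolean shrunk = true;
        // Repeat full passes until a pass removes nothing.
        while (shrunk) {
            shrunk = false;
            for (int i = current.size() - 1; i >= 1; i--) {
                String removed = current.remove(i);
                if (stillFails.test(current)) {
                    shrunk = true;            // problem persists, keep the row out
                } else {
                    current.add(i, removed);  // problem went away, restore the row
                }
            }
        }
        return current;
    }
}
```

Each call to the predicate reruns the failing operation, so on large files it may be worth removing blocks of rows first and single rows afterwards.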
At the moment I am pretty busy with the final steps of my experiments and writing so I am going to stick with the files that are patched without problem. However, I have a corpus of data with thousands of sources, each one with different versions, that I will try to run with the tool.
This process should show me the smallest file in which this error arises, and that is probably the best target for the bit-by-bit analysis, since there are files smaller than the one I've uploaded.
With that file I can do the analysis. However, I translated the latest version of the library to Java, and with ordering off it still has the same error.
I've found that the patch method from coopy sends null by default as CompareFlags, so I modified that line in Coopy.java. Still, the output is unchanged with CompareFlags.ordered = false.
Is there anything else buggy, or is it something I am doing wrong?
I am calling it as:
coopy.CompareFlags flags = new coopy.CompareFlags();
flags.ordered = false;
coopy.Coopy.patch(table1, table2, flags);
Understood.
For the flags, you'd need to set ordered = false when the diff is being generated, not just when patching.
Sorry for not mentioning that: I set it when comparing as well, but there is no change :/
I wrote a program to randomly strip out lines from the CSVs to find minimal examples that tickle a problem. An example:
Others are similar. Duplicated rows are definitely hard to deal with. I'll check the logic.
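Since duplicated rows seem to be involved, a quick way to check whether a given CSV contains any is sketched below. It is a rough triage aid, not daff code: the class name `DuplicateRows` is hypothetical, and rows are compared as raw text lines, so two rows that differ only in quoting or whitespace would not be counted as duplicates.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of daff): reports rows that occur more than once.
class DuplicateRows {
    /** Returns each repeated row with its number of occurrences, in first-seen order. */
    static Map<String, Integer> find(List<String> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String row : rows) {
            counts.merge(row, 1, Integer::sum);  // count every occurrence
        }
        counts.values().removeIf(n -> n < 2);    // drop rows seen only once
        return counts;
    }
}
```

Running this over the files that patch incorrectly and over those that patch cleanly would show whether the failures line up with the presence of duplicates.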
I am seeing incorrect patching in Java for some of the files I am analyzing.
To be more specific, with these two files, for example, which come from OpenData portals, the daff output is perfect, but when applying
coopy.Coopy.patch(table1, table2, null)
the modified table1 is a complete mess. I am not sure, but could it be due to the language/encoding? The problems I encountered happened with some of the files that are not only in English.
20140812120302.txt 20140822230142.txt
And this is the output:
20140822230142.txt