Closed thomas-tran closed 11 years ago
The dump size shouldn't be a problem. (The biggest dump I've heard of someone using svndumpsanitizer on was >700GB)
As for the rest, it unfortunately doesn't contain enough information to diagnose the problem :-( To get an outfile of 0 bytes sounds really odd, though... Even if you somehow excluded everything in the repository, at least revision 0 should be written. What happens if you don't specify an outfile, and let it write to stdout instead?
Knowing the actual full command you're using would also be useful. Having the dump file would be nice as well, but I understand that such a behemoth is quite unwieldy, and probably contains sensitive data...
That looks ok... What happens if you omit the outfile parameter?
I haven't tried that. What will happen if I omit the output file? Will it overwrite the input?
No. It should just write everything to stdout.
Thank you very much for such great support. I tried omitting the output file and writing to stdout; here is the result:
Revision-number: 88147
Prop-content-length: 133
Content-length: 133

K 7
svn:log
V 22
Deleted unwanted nodes
K 10
svn:author
V 16
svndumpsanitizer
K 8
svn:date
V 27
2013-10-20T09:25:28.000000Z
PROPS-END

Node-path: tags
Node-action: delete

Node-path: DBScripts/To be archived
Node-action: delete
Ok... So it does appear to do something. And given that you specified the --drop-empty parameter, it would seem that it has indeed kept almost 90000 revisions. Now what happens if you run the exact same command and redirect the output to a file? I.e. svndumpsanitizer --infile [...] --drop-empty > filtered.dump
I tried redirecting the output and still got the same result. However, when I tried another dump of 120 GB, it was OK. Every dump larger than 160 GB gave me a 1 KB file.
That's really odd... But looking at the output again, if it only outputs that revision and nothing else it's almost like the rewind-operation (line 747 in version 1.2.1) isn't properly performed.
I've never had any problems with that myself, no matter how big the file, but someone mentioned that for some reason it didn't work with some versions of Windows, despite being supported according to the documentation. You could try replacing that line with this one (which should do the same thing), and recompile:
fseeko(infile, 0 , SEEK_SET);
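For anyone applying the same change, here is a minimal sketch of the idea, not the actual svndumpsanitizer code. The helper names rewind_checked and first_byte_after_full_pass are hypothetical, and the _FILE_OFFSET_BITS define assumes a toolchain (such as glibc or MinGW-w64) that honours it. Unlike rewind(), fseeko() returns a status, so a failed seek can be detected instead of silently producing truncated output.

```c
/* Assumption: these feature macros are honoured by the toolchain and
 * appear before any system header is included. */
#define _FILE_OFFSET_BITS 64      /* request a 64-bit off_t on 32-bit builds */
#define _POSIX_C_SOURCE 200809L   /* expose fseeko() under strict -std modes */

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: seek back to the start of the stream and report
 * failure instead of ignoring it, which is what rewind() does.
 * Returns 0 on success, -1 on error (errno is set by fseeko). */
static int rewind_checked(FILE *f)
{
    assert(f != NULL);
    if (fseeko(f, (off_t)0, SEEK_SET) != 0) {
        return -1;
    }
    clearerr(f);  /* rewind() also clears the error indicator; mirror that */
    return 0;
}

/* Hypothetical two-pass reader: consume the stream to EOF, go back to the
 * start, and return the first byte again (EOF if the seek or read failed). */
static int first_byte_after_full_pass(FILE *f)
{
    while (fgetc(f) != EOF) {
        /* first pass: read everything */
    }
    if (rewind_checked(f) != 0) {
        return EOF;
    }
    return fgetc(f);
}
```

If rewind_checked() returns -1 after the first pass, the program can print an error and exit rather than continue with the file position stuck at the end of the dump.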
Actually, I was using your old version 1.01, compiled on Windows 7, and it did not run successfully on dumps over 150 GB. I tried getting the latest version from GitHub and compiled it with 32-bit gcc, but got an incompatibility exception when I executed it.
Which version is the best one to compile on Windows?
Thanks
I don't know. :-P I kicked my last windows installation to the curb back in 2005, and haven't looked back since. The windows patch is an external contribution that I haven't tested myself.
The windows patch has never changed, though, so I think all versions should be equally easy/problematic to compile. For that reason you should use the latest version, which contains some bug fixes. (I think the guy submitting the patch was using visual studio to compile it...)
After replacing rewind with fseeko(infile, 0 , SEEK_SET), it worked perfectly for dumps of more than 300 GB.
Thank you very much
Hi dsuni,
I tried to use svndumpsanitizer on my full dump file to filter about 10 paths. The original dump file is about 220 GB. The output file is always 0 bytes, no matter how many times I rerun it.