Closed idbxy closed 5 years ago
Hi there,
I haven't seen that error before.
Unfortunately I'm camping at the moment; the earliest I can look at it is Tuesday.
The only thing I can think of would be to try downloading the stack dump prior to the one you were using. (I realize that's quite a hassle.)
Cheers, Ben
On Fri, 23 Aug 2019, 12:43 idbxy, notifications@github.com wrote:
I get an error when parsing reaches 8GB out of 18GB of Comments.xml:
https://i.imgur.com/9AdCzm4.png
What's the cause of this? Is it possible to answer this quickly? I need it within 2 days, before I leave.
So I downloaded the previous year's stack dump and had the same issue again:
    at org.tools4j.stacked.index.FileInZipParser.start(SeZipFileParser.kt:189)
    at org.tools4j.stacked.index.ExtractCallback$getStream$2.run(SeZipFileParser.kt:104)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.tools4j.stacked.index.XmlFileParserException: Write end dead child number [52986815]
    at org.tools4j.stacked.index.XmlFileParser.parseElements(XmlFileParser.kt:64)
    at org.tools4j.stacked.index.XmlFileParser.parse(XmlFileParser.kt:20)
    at org.tools4j.stacked.index.FileInZipParser.start(SeZipFileParser.kt:187)
    ... 6 more
Caused by: com.ctc.wstx.exc.WstxIOException: Write end dead
    at com.ctc.wstx.sr.StreamScanner.constructFromIOE(StreamScanner.java:640)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1004)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:1043)
    at com.ctc.wstx.sr.StreamScanner.getNextChar(StreamScanner.java:789)
    at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:1973)
    at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3145)
    at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:3043)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2919)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
    at org.codehaus.stax2.ri.Stax2EventReaderImpl.nextEvent(Stax2EventReaderImpl.java:255)
    at org.tools4j.stacked.index.XmlFileParser.parseElements(XmlFileParser.kt:37)
    ... 8 more
Caused by: java.io.IOException: Write end dead
    at java.io.PipedInputStream.read(Unknown Source)
    at java.io.PipedInputStream.read(Unknown Source)
    at com.ctc.wstx.io.BaseReader.readBytes(BaseReader.java:155)
    at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:369)
    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:112)
    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:89)
    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:998)
    ... 17 more
21:31:00.380 [Thread-15] ERROR org.tools4j.stacked.index.SeDirParser - Write end dead child number [52986815] in file [Comments.xml] whilst parsing archive [D:\StackOverflow\Website\stackoverflow.com-Comments.7z]
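For context on the innermost cause: "Write end dead" is the message java.io.PipedInputStream raises when the thread that was writing into the pipe has terminated without closing its end. The sketch below reproduces that message in isolation. The class and method names are hypothetical, and it makes no claim about what actually happened inside stacked-off (the writer thread there may have died for any number of reasons); it only shows what the reader side sees in that situation.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical demo class, not part of stacked-off.
class WriteEndDeadDemo {

    /**
     * Starts a writer thread that writes one byte and exits, optionally closing
     * the pipe first. Returns the IOException message the reader gets on its
     * second read, or null if the reader sees a clean end of stream.
     */
    static String readAfterWriterDies(boolean closeBeforeExit) {
        try {
            PipedInputStream in = new PipedInputStream();
            PipedOutputStream out = new PipedOutputStream(in);

            Thread writer = new Thread(() -> {
                try {
                    out.write(42);
                    if (closeBeforeExit) {
                        out.close(); // signals a normal end of stream
                    }
                } catch (IOException ignored) {
                    // not expected in this sketch
                }
            });
            writer.start();
            writer.join(); // the writer thread is no longer alive from here on

            in.read(); // consumes the single buffered byte
            try {
                int next = in.read(); // buffer is empty now
                return next == -1 ? null : "unexpected byte " + next;
            } catch (IOException e) {
                return e.getMessage();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Writer exited without closing: the reader gets an IOException.
        System.out.println(readAfterWriterDies(false)); // Write end dead
        // Writer closed the pipe before exiting: the reader sees end of stream.
        System.out.println(readAfterWriterDies(true));  // null
    }
}
```

Read this way, the trace suggests that the thread feeding Comments.xml from the archive into the XML parser stopped partway through, and the parser only reported the symptom.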
Is this quickly solvable on my end by Monday? I would REALLY appreciate it if you could look into it; it's needed ASAP and very important.
No, it won't be fixed by Monday.
On Sat, 24 Aug 2019, 20:50 idbxy, notifications@github.com wrote:
is this quickly solve able on my end by monday?
Is there anything else I could try? Or do you know of any other ways to use Stack Overflow offline?
Not that I can think of, without looking at it further. Am camping right now so no computer access.
You could try using the other stack dump tool that is available.
On Sat, 24 Aug 2019, 21:06 idbxy, notifications@github.com wrote:
Is there anything else I could try?
Hello,
I moved the stacked-off folder to the C drive (primary drive) instead of the D drive, and moved the website folder (the rar, 7z data dumps) inside the stacked-off folder where the bin and lib folders are.
I had the idea after looking at some other similar projects that said to store things on the primary drive instead of other drives, and I thought it was worth giving it a shot.
Previously it was:
D:\Foldername\Website
D:\Foldername\StackedOff\Bin+Lib
Now it's:
C:\StackedOff\Bin+Lib
C:\StackedOff\Website
So far this has resolved the issue: I no longer get the "Write end dead" child error, and I'm almost done indexing Stack Overflow.
If I don't respond again, it means this solved the issue and I wanted you to know how.
so it might be an idea to change this line in the readme
Download the latest zip version from here, and unzip into your desired location.
into
Download the latest zip version from here, and unzip into your desired location on your primary drive (usually the C drive)
okay to confirm
it's fixed that way
any way to search with tags? like c++ tag only
thanks!
Ok, thanks for letting me know.
There's no way to search specifically by tag. But tags are included when you do a search, so if you search for c++ it should include questions which have the c++ tag.
If for some reason the c++ search isn't looking right, try using double quotes, e.g. "c++".
On Sun, 25 Aug 2019, 15:56 idbxy, notifications@github.com wrote:
okay to confirm
it's fixed that way
any way to search with tags? like c++ tag only