matt-gardner / pra

PRA ArrayIndexOutOfBoundsException #22

Closed penglin03 closed 7 years ago

penglin03 commented 7 years ago

Hello Matt,

We have successfully run SFE. However, when running PRA we hit an ArrayIndexOutOfBoundsException (see below). Could you please take a look? I would appreciate it.

Thank you in advance!

    Computing feature values
    9:48:59 PM walk-manager - t:1 INFO: Initial size for walk bucket: 32
    9:48:59 PM path-follower execute - t:1 INFO: Creating feature matrix using random walks
    9:48:59 PM long-walk-manager initializeWalks - t:1 INFO: Calculate sizes. Walks length:3574
    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1786
        at edu.cmu.graphchi.walks.LongWalkManager.initializeWalks(LongWalkManager.java:150)
        at edu.cmu.graphchi.walks.DrunkardDriver.initWalks(DrunkardDriver.java:98)
        at edu.cmu.graphchi.walks.DrunkardMobEngine.run(DrunkardMobEngine.java:147)
        at edu.cmu.ml.rtw.pra.features.RandomWalkPathFollower.execute(RandomWalkPathFollower.java:129)
        at edu.cmu.ml.rtw.pra.features.PraFeatureGenerator.computeFeatureValues(PraFeatureGenerator.scala:124)
        at edu.cmu.ml.rtw.pra.features.PraFeatureGenerator.createTestMatrix(PraFeatureGenerator.scala:52)
        at edu.cmu.ml.rtw.pra.operations.TrainAndTest.runRelation(Operation.scala:108)
        at edu.cmu.ml.rtw.pra.experiments.Driver$$anonfun$_runStep$1.apply(Driver.scala:93)
        at edu.cmu.ml.rtw.pra.experiments.Driver$$anonfun$_runStep$1.apply(Driver.scala:87)
        at scala.collection.immutable.Stream.foreach(Stream.scala:594)
        at edu.cmu.ml.rtw.pra.experiments.Driver._runStep(Driver.scala:87)
        at com.mattg.pipeline.Step.runStep(Step.scala:187)
        at com.mattg.pipeline.Step._runPipeline(Step.scala:152)
        at com.mattg.pipeline.Step.runPipeline(Step.scala:79)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner$.runPraFromSpec(ExperimentRunner.scala:76)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner$$anonfun$runPra$1.apply(ExperimentRunner.scala:61)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner$$anonfun$runPra$1.apply(ExperimentRunner.scala:61)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner$.runPra(ExperimentRunner.scala:61)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner$.main(ExperimentRunner.scala:31)
        at edu.cmu.ml.rtw.pra.experiments.ExperimentRunner.main(ExperimentRunner.scala)

matt-gardner commented 7 years ago

What data are you running this on? It looks like you're somehow getting a negative number for a source id here: https://github.com/matt-gardner/pra/blob/d54f83d54bcb0d29e196869bcc001deb16ad56d3/src/main/java/edu/cmu/ml/rtw/pra/features/RandomWalkPathFollower.java#L120

I'm not sure how this is possible. I also don't have a whole lot of bandwidth to help you figure it out, unfortunately. I'd probably start by figuring out whether you actually are getting negative source ids, and where they come from if you are.

penglin03 commented 7 years ago

Thanks, Matt. I tried both your dataset and my own dataset (processed YAGO).

Based on your hint, I printed the sourcesMap, and it did have one negative ID, which was "-1". See below.

I also looked into the folder generated inside "/graphs". It contains several files, such as edge_dict.tsv, node_dict.tsv and "/graph_chi/edges.tsv", and all of them contain only positive IDs.

Is there any other way to identify the problem?

Or is there a workaround to bypass this issue, such as checking whether an instance is in the graph?

    for (NodePairInstance instance : instances) {
      if (instance.isInGraph()) {
        MapUtil.addValueToKeySet(sourcesMap, instance.source(), instance.target());
      }
    }

Thanks again.

sourcesMap =

{-1=[69985, 172435, 285987], 312837=[117919], 293893=[82813], 217604=[131220, 226620], 146949=[399849], 408065=[407357], 34311=[117493], 376844=[162274], 432141=[139057], 333333=[144], 449557=[66492], 115218=[141577], 152593=[340626], 78869=[444542], 282640=[181523], 320029=[70092], 346148=[181875], 111136=[328984], 448544=[248724], 277027=[147060], 278051=[63772], 208939=[103830], 266798=[423597], 448554=[-1], 193074=[297498], 151601=[50507], 423475=[202629], 306738=[381545], 229428=[121387], 421425=[26805], 15418=[62951], 222268=[173456], 363589=[111013], 287813=[158317], 395330=[73515], 62532=[74840], 416321=[323371], 189002=[255828], 431695=[166134], 451660=[98784], 412749=[253221], 96332=[292037], 411723=[247304], 415816=[220612], 309329=[205202], 94807=[435083], 20568=[338262], 141410=[278510], 121960=[67722], 64617=[380514], 206954=[107204], 111214=[62820, 194857, 260584], 114800=[242582], 190065=[87034], 190077=[356119], 54400=[207908], 311424=[309710], 24196=[290673], 135814=[384719], 265857=[82126], 450177=[362966], 6792=[19000], 229512=[91657], 427147=[271655], 409224=[152582], 418952=[16343], 258203=[67222], 62617=[65998], 254106=[49166], 114840=[155366], 390814=[53067], 149662=[237419], 106653=[401326], 251043=[265117], 102049=[217304], 178858=[94619], 365228=[257306], 411821=[324907], 195753=[452827], 87728=[66367], 102067=[398810], 192688=[53132], 10930=[256877], 354998=[420634], 391858=[390993], 357564=[76865], 335548=[150650], 232632=[172393], 250046=[187774], 205502=[32830], 393915=[7895], 164028=[204417], 250049=[211040], 131275=[127826], 421069=[-1], 284872=[12346], 199886=[450665], 125134=[286744], 447190=[367698], 406229=[48085], 335062=[43154], 132310=[166517], 196823=[239646], 30422=[165739], 344796=[199247], 34010=[342763], 325337=[151147], 75486=[264514], 261347=[298667], 415974=[166969], 126688=[416504], 242402=[13958], 403172=[51726], 400610=[134505], 434403=[300785], 9965=[111859], 87280=[291543], 87282=[316199], 320247=[240514], 
188150=[102002], 431359=[293527], 329468=[86337], 392443=[60906], 306438=[40286], 455940=[110240], 422146=[147236], 214794=[102897], 102667=[48418], 384270=[327124], 312072=[435319], 370440=[204530], 228621=[47673], 292619=[131552], 343306=[229951], 84247=[150951], 160540=[-1, 169468], 107294=[422491], 443172=[234678], 125737=[420343], 385836=[162841], 32554=[336091], 269615=[411137], 191790=[18874], 244527=[76134], 440619=[248726], 440107=[358612], 285492=[42327], 386357=[89037], 244529=[175650], 335154=[158568], 208187=[336442], 294718=[301860, 136288, 157483, 210922, 110218, 84301, 216719, 84300, 97777, 148243, 271665, 55541, 367837, 368765, 131643, 292568, 51326, 3134, 108926, 292571], 401722=[116809], 321863=[66536], 408386=[264742], 250182=[171505], 129350=[99241], 450369=[160723], 109384=[243365], 421199=[334780], 125771=[83119], 149324=[148554], 286539=[94032], 277844=[122943], 55632=[26240], 322385=[335188], 166741=[451253, 85100], 415578=[215337], 399193=[267663], 244579=[224536], 272742=[112825], 235361=[76899], 315238=[346760], 417645=[407123], 175468=[13441, 337840, 175876, 108923, 222558], 244588=[43488], 388983=[278205], 313724=[325998], 98681=[455793], 361340=[202463], 218489=[242448], 439163=[398909], 440187=[339142], 89983=[277463], 219005=[181158], 154492=[133332], 24960=[174623], 437637=[24109], 279424=[88862], 174469=[337063], 209289=[434339], 132489=[284665], 406922=[95168], 308104=[325367], 193932=[63394], 407432=[264133], 17808=[225670], 99729=[31233], 224149=[20040], 56727=[203551], 343453=[152522], 319900=[309902], 82329=[292901], 413597=[115978], 251288=[446541], 200604=[291497], 270247=[428591], 454051=[19567], 127399=[321469], 440751=[56235], 426924=[7266], 389551=[212881], 257448=[105151], 282537=[211603], 300457=[73179], 411062=[428349], 241591=[451710], 279485=[410484], 347583=[68214], 431547=[354455], 426937=[74812], 116677=[168829], 239050=[454257], 193998=[154607], 426954=[439206], 280532=[198587], 334292=[173164], 12241=[265039], 
235985=[4505], 427477=[29308], 325072=[394314], 426451=[-1], 369619=[246885], 120791=[98597], 161755=[250069], 122843=[173350], 388569=[291448], 262116=[183010], 13799=[246120], 300524=[108627], 164840=[88291], 381422=[226171], 44524=[286000], 160755=[325712], 33780=[281113], 422898=[380290], 433648=[5802], 363519=[131657], 206841=[248718], 421373=[299752], 416253=[288600], 417274=[48784]}

matt-gardner commented 7 years ago

We need to know the content of sources, not sourcesMap - try printing out the original source and the translated source here: https://github.com/matt-gardner/pra/blob/d54f83d54bcb0d29e196869bcc001deb16ad56d3/src/main/java/edu/cmu/ml/rtw/pra/features/RandomWalkPathFollower.java#L100. It could be that the graphchi version of your graph is older and has fewer nodes than the rest of the code is expecting, and so graphchi is getting confused and using negative numbers. Or maybe it's just -1 that's getting translated to -1786, and it should just be removed.

penglin03 commented 7 years ago

Printed. "-1" has been translated to "-228690". I think "-1786" is that number divided by the bucket size in GraphChi. So we can just remove the single -1 source, right?

    /* Precalculate bucket sizes for performance */
    int[] tmpsizes = new int[walks.length];
    for(int j=0; j < sourceSeqIdx; j++) {
        int source = sources[j];
        tmpsizes[source / bucketSize] += sourceWalkCounts[j];
    }

    Computing feature values
    source: -1, transSource: -228690
    source: 312837, transSource: 385108
    source: 293893, transSource: 375636
    source: 217604, transSource: 108802
    source: 146949, transSource: 302164
    source: 408065, transSource: 432722
    source: 34311, transSource: 245845
    source: 376844, transSource: 188422
    source: 432141, transSource: 444760
    source: 333333, transSource: 395356
    source: 449557, transSource: 453468
    source: 115218, transSource: 57609
    source: 152593, transSource: 304986
    source: 78869, transSource: 268124
    source: 282640, transSource: 141320
    source: 320029, transSource: 388704
    source: 346148, transSource: 173074
    source: 111136, transSource: 55568
    source: 448544, transSource: 224272
    source: 277027, transSource: 367203
    source: 278051, transSource: 367715
    source: 208939, transSource: 333159
    source: 266798, transSource: 133399
    source: 448554, transSource: 224277
    source: 193074, transSource: 96537
    source: 151601, transSource: 304490
    source: 423475, transSource: 440427
    source: 306738, transSource: 153369
    source: 229428, transSource: 114714
    source: 421425, transSource: 439402
    source: 15418, transSource: 7709
    source: 222268, transSource: 111134
    source: 363589, transSource: 410484
    source: 287813, transSource: 372596
    source: 395330, transSource: 197665
    ......
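
The arithmetic behind that guess can be checked in isolation. The sketch below is not GraphChi code: the class and method names are made up for illustration, and the bucket size of 128 is an assumption inferred from the numbers in this thread (-228690 / 128 truncates to -1786, and 3574 buckets of 128 would cover the range of node IDs seen above). It only shows that Java integer division truncates toward zero, so a negative translated source yields a negative bucket index.

```java
public class BucketIndexDemo {
    // Assumption: 128 is inferred from the numbers in the thread, not read from GraphChi.
    static final int ASSUMED_BUCKET_SIZE = 128;

    // Same expression as `source / bucketSize` in the snippet above.
    static int bucketIndex(int translatedSource, int bucketSize) {
        return translatedSource / bucketSize; // int division truncates toward zero
    }

    public static void main(String[] args) {
        int[] tmpsizes = new int[3574]; // "Walks length:3574" from the log
        int idx = bucketIndex(-228690, ASSUMED_BUCKET_SIZE);
        System.out.println("bucket index: " + idx); // prints "bucket index: -1786"
        try {
            tmpsizes[idx] += 1; // negative index, so this throws
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught ArrayIndexOutOfBoundsException");
        }
    }
}
```

A valid translated source from the same log, such as 385108, maps to bucket 3008 under the same assumption, which is inside the 3574-element array.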

matt-gardner commented 7 years ago

Yeah, you should be able to just remove the -1. It's there because you have a node in your training data that doesn't show up in your graph - you might want to check why that's happening, as it might be a bug somewhere in your pipeline. But removing the -1 should fix this particular crash.
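
A minimal, self-contained sketch of that removal, which also records what was dropped so the missing nodes can be investigated afterwards. It does not use the PRA classes: `MissingNodeFilter`, `buildSourcesMap`, and the `int[] {source, target}` pair representation are hypothetical stand-ins, and treating -1 as "node not in graph" for both sources and targets is an assumption based on the printed sourcesMap.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MissingNodeFilter {
    // Assumption: -1 marks a node that was not found in the graph dictionary.
    static final int MISSING = -1;

    // Builds source -> targets, skipping any pair with a missing endpoint.
    // Skipped pairs are appended to `dropped` so they can be reported.
    static Map<Integer, Set<Integer>> buildSourcesMap(List<int[]> pairs, List<int[]> dropped) {
        Map<Integer, Set<Integer>> sourcesMap = new LinkedHashMap<>();
        for (int[] pair : pairs) {
            int source = pair[0];
            int target = pair[1];
            if (source == MISSING || target == MISSING) {
                dropped.add(pair);
                continue;
            }
            sourcesMap.computeIfAbsent(source, k -> new LinkedHashSet<>()).add(target);
        }
        return sourcesMap;
    }

    public static void main(String[] args) {
        List<int[]> pairs = new ArrayList<>();
        pairs.add(new int[] {-1, 69985});      // missing source, as in the dump
        pairs.add(new int[] {312837, 117919});
        pairs.add(new int[] {448554, -1});     // missing target, as in the dump
        pairs.add(new int[] {217604, 131220});
        pairs.add(new int[] {217604, 226620});

        List<int[]> dropped = new ArrayList<>();
        Map<Integer, Set<Integer>> sourcesMap = buildSourcesMap(pairs, dropped);
        System.out.println("kept sources: " + sourcesMap.keySet());
        System.out.println("dropped pairs: " + dropped.size());
    }
}
```

Whether pairs with a -1 target also need to be dropped is not settled in this thread; the sketch drops them to mirror the `isInGraph()` idea proposed earlier, and that condition can be narrowed to sources only.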

penglin03 commented 7 years ago

Okay. I will remove -1. Thanks.

P.S. I don't think it is a problem with my pipeline or data, since PRA also crashed on a -1 index with your settings and the NELL data. I am not sure whether you can reproduce it on your side. A friend of mine also ran into this problem.

Thanks again for your help!