woc-hack / tutorial

b2ta vs b2tP #58

Closed mahmoudjahanshahi closed 7 months ago

mahmoudjahanshahi commented 8 months ago

There seems to be some inconsistency between these two tables. Specifically, there are blob-time combinations in b2tP that cannot be found in b2ta. For instance:

# on ISAAC-NG
> zcat /nfs/home/audris/work/c2fb/b2tPFullV0.s | grep -e "^000292eaf94094c2db30749420e25bd8f0cd42b1;1477142560;" -e "^000292eaf94094c2db30749420e25bd8f0cd42b1;1483708128;"
000292eaf94094c2db30749420e25bd8f0cd42b1;1477142560;osteffen_epics
000292eaf94094c2db30749420e25bd8f0cd42b1;1483708128;ukaea_epics
# Or:
> zcat /nfs/home/audris/work/c2fb/b2tPFullV0.s | grep -e "^00037ec437989bd7edb68cf3110f5d350825bba4;1235435297;" -e "^00037ec437989bd7edb68cf3110f5d350825bba4;1430705448;"
00037ec437989bd7edb68cf3110f5d350825bba4;1235435297;epics-modules_motor
00037ec437989bd7edb68cf3110f5d350825bba4;1430705448;bitbucket.org_whitegr_atf2-flight-simulator

As we can see, these blob-time pairs are present in b2tP. The same lookups against b2ta, however, return nothing:

# on da
> zcat /da?_data/basemaps/gz/b2taFullV0.s | grep -e "^000292eaf94094c2db30749420e25bd8f0cd42b1;1477142560;" -e "^000292eaf94094c2db30749420e25bd8f0cd42b1;1483708128;"
no result!
> zcat /da?_data/basemaps/gz/b2taFullV0.s | grep -e "^00037ec437989bd7edb68cf3110f5d350825bba4;1235435297;" -e "^00037ec437989bd7edb68cf3110f5d350825bba4;1430705448;"
no result!
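
To list every blob-time pair that is in b2tP but missing from b2ta, rather than checking pairs one at a time, something like the following could be used. This is only a sketch on toy data: the file names, the short placeholder ids, and the `keys` helper are hypothetical, and it assumes the first two `;`-separated fields of both tables are blob and time, as in the output above.

```shell
#!/bin/sh
# Toy stand-ins for the two tables (placeholder ids, not real hashes).
tmp=$(mktemp -d)
printf '%s\n' 'b1;100;projA' 'b1;200;projB' 'b2;300;projC' > "$tmp/b2tP"
printf '%s\n' 'b1;100;author1;c1' 'b2;300;author2;c2' > "$tmp/b2ta"

# keys FILE: print the unique blob;time pairs of FILE in byte order.
keys() { cut -d\; -f1,2 "$1" | LC_ALL=C sort -u; }

keys "$tmp/b2tP" > "$tmp/P.keys"
keys "$tmp/b2ta" > "$tmp/a.keys"

# comm -23 keeps lines that appear only in the first (sorted) input.
missing=$(LC_ALL=C comm -23 "$tmp/P.keys" "$tmp/a.keys")
echo "$missing"

rm -rf "$tmp"
```

On the real tables the inputs would first have to be decompressed with zcat, and both key lists must be sorted under the same locale for comm to work.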

audrism commented 8 months ago

Here is a way to debug this.

First, find the project (P) in b2tP:

zcat /da?_data/basemaps/gz/b2tPFullV0.s | grep 000292eaf94094c2db30749420e25bd8f0cd42b1  | grep 1477142560
000292eaf94094c2db30749420e25bd8f0cd42b1;1477142560;osteffen_epics

Now find the commit in that project with the same timestamp:

echo osteffen_epics | ~/lookup/getValues -f P2c| cut -d\; -f2 | ~/lookup/getValues c2dat | grep 1477142560
20556240f157c6b812d66e900d119e0db3bce391;1477142560;+0200;Oliver Steffen <olisteffen@posteo.de>;3f5a023665d2fb9608f1789edcd7ac73db1304aa;34e2211e4c7753050070ea1b4c455c4bfa2b5ac3

Finally, verify that this commit created the blob:

echo 20556240f157c6b812d66e900d119e0db3bce391 | ~/lookup/cmputeDiff3.perl | grep  000292eaf94094c2db30749420e25bd8f0cd42b1  
20556240f157c6b812d66e900d119e0db3bce391;/base-3.15.5-pre1/src/ca/legacy/pcas/generic/ioBlocked.h;000292eaf94094c2db30749420e25bd8f0cd42b1;

So b2tP is correct.

See the calculation workflow via

dotty dep.dot

b2tP is derived from c2P, c2dat, and c2fbb in b2ob.slurm, while b2ta is derived from c2fbb and c2dat, also in b2ob.slurm:

...
zcat c2fbbFull$ver$j.s | cut -d\; -f 1,3 |  join -t\; - <(zcat ../gz/c2datFull$ver$j.s| cut -d\; -f1-2,4)
done | perl -ane 'chop();@x=split(/;/, $_, -1); print "$x[1];$x[2];$x[3];$x[0]\n" if "$x[0];$x[1]" =~ m/^[0-9a-f]{40};[0-9a-f]{40}$/;' | perl -I $HOME/lib/perl5 -I $HOME/lookup $HOME/lookup/splitSec.perl b2taFull$ver.$l. 128

This also works, so the problem must lie in how the data were processed. I will look into it.
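
The join step quoted above can be tried on the single record from the debugging session. This is a minimal sketch, not the production script: the c2fbb layout (commit;path;blob;old-blob) is an assumption based on the diff output shown earlier, the temporary file names are hypothetical, and awk stands in for the perl one-liner that reorders the fields.

```shell
#!/bin/sh
tmp=$(mktemp -d)
c=20556240f157c6b812d66e900d119e0db3bce391
b=000292eaf94094c2db30749420e25bd8f0cd42b1

# Assumed c2fbb layout: commit;path;blob;old-blob (old blob empty here).
printf '%s;/base-3.15.5-pre1/src/ca/legacy/pcas/generic/ioBlocked.h;%s;\n' "$c" "$b" > "$tmp/c2fbb"
# c2dat record as returned by the lookup above: commit;time;tz;author;tree;parent.
printf '%s;1477142560;+0200;Oliver Steffen <olisteffen@posteo.de>;3f5a023665d2fb9608f1789edcd7ac73db1304aa;34e2211e4c7753050070ea1b4c455c4bfa2b5ac3\n' "$c" > "$tmp/c2dat"

# Same field selection as the slurm fragment: commit;blob and commit;time;author.
cut -d\; -f1,3 "$tmp/c2fbb" > "$tmp/left"
cut -d\; -f1-2,4 "$tmp/c2dat" > "$tmp/right"

# Join on commit, keep rows whose commit and blob are 40 hex characters,
# and reorder to blob;time;author;commit.
b2ta=$(LC_ALL=C join -t\; "$tmp/left" "$tmp/right" |
  awk -F\; -v OFS=\; 'length($1) == 40 && length($2) == 40 && ($1 $2) ~ /^[0-9a-f]+$/ { print $2, $3, $4, $1 }')
echo "$b2ta"

rm -rf "$tmp"
```

The resulting line starts with the blob and timestamp, which is the prefix the original grep on b2ta was looking for.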

audrism commented 7 months ago

I regenerated everything as b2taFullV.s1 on isaac.utk.edu and have now copied it to /da7_data/basemaps/gz (as b2taFullV.s).

One potential reason is that join silently crashes when you have several million identical keys.
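
One way to catch this kind of silent loss would be to compute how many rows the join should produce and compare that with what join actually emits. The sketch below is an illustration on toy data, not part of the existing scripts: `expected_join_rows` and the file names are hypothetical, and it assumes an inner join on the first `;`-separated field.

```shell
#!/bin/sh
# expected_join_rows LEFT RIGHT: number of rows an inner join on field 1
# should emit, i.e. the sum over keys of (left count * right count).
expected_join_rows() {
  awk -F\; 'NR == FNR { n[$1]++; next }
            ($1 in n) { total += n[$1] }
            END       { print total + 0 }' "$1" "$2"
}

tmp=$(mktemp -d)
# Toy inputs with a repeated key (k1) on both sides.
printf 'k1;a\nk1;b\nk2;c\n' > "$tmp/left"
printf 'k1;x\nk1;y\nk3;z\n' > "$tmp/right"

expected=$(expected_join_rows "$tmp/left" "$tmp/right")
actual=$(LC_ALL=C join -t\; "$tmp/left" "$tmp/right" | wc -l | tr -d ' ')

if [ "$expected" = "$actual" ]; then
  echo "join output complete: $actual rows"
else
  echo "join lost rows: expected $expected, got $actual" >&2
fi

rm -rf "$tmp"
```

The awk pass keeps one counter per distinct key of the left file in memory, so on the full tables it would probably need to run per shard.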

The random-lookup tables are b2tac (as they are supposed to be).