jbrzusto / find_tags

search raw data streams for patterns from registered tags
GNU General Public License v2.0

find_tags_motus spinning in Ambiguity::add() #59

Closed jbrzusto closed 6 years ago

jbrzusto commented 6 years ago

The call at Ambiguity.cpp:37:

abm.right.replace_data(i, s);

never returns and spins at high CPU. Both find_tags_motus runs where this happened were processing the activation event for motus tag 27790, and both had been run with --resume.

jbrzusto commented 6 years ago

One of these sessions runs correctly without --resume:

find_tags_motus --pulses_to_confirm=8 --frequency_slop=0.5 --min_dfreq=0 --max_dfreq=12 --pulse_slop=1.5 --burst_slop=4 --burst_slop_expansion=1 --use_events --max_skipped_bursts=20 --default_freq=166.376 --bootnum=176 --src_sqlite /sgm_local/motus_meta_db.sqlite ~/test/SG-4001BBBK2600.motus

but running it with --resume exposes the bug.

jbrzusto commented 6 years ago

Arrgh... why am I not serializing the Ambiguity singleton? I must have thought I was rebuilding graphs (and hence the Ambiguity object) upon resume, which is not (any longer?) the case.

On second look, the ambiguity table is saved in the receiver DB, but only for those ambiguities actually detected. So if a Graph includes ambiguous tags which have not been detected at the time the session is paused, the corresponding ambiguity map entries are not saved, and upon restoring, the Graph contains dangling references to now-undefined Ambiguities.
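
To make the failure mode concrete, here is a minimal sketch of it. This is not the find_tags code: tags and proxies are reduced to plain motus IDs, and `AmbiguityMap`, `saveDetectedOnly` and `danglingProxies` are hypothetical names invented for illustration. It assumes only what is described above: proxies have negative IDs, and only detected ambiguities are written out at pause time.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for the ambiguity map: proxy ID (negative) -> IDs of
// the real tags it represents.
using AmbiguityMap = std::map<int, std::set<int>>;

// Mimic the save step described above: only ambiguities whose proxy has
// actually been detected are written out.
AmbiguityMap saveDetectedOnly(const AmbiguityMap& all, const std::set<int>& detected) {
    AmbiguityMap saved;
    for (const auto& kv : all) {
        assert(kv.first < 0);  // proxies carry negative motus IDs
        if (detected.count(kv.first)) saved.emplace(kv.first, kv.second);
    }
    return saved;
}

// After a resume, list the proxy IDs the restored graph still refers to but
// the restored map no longer defines: the dangling references.
std::vector<int> danglingProxies(const std::set<int>& graphProxies, const AmbiguityMap& restored) {
    std::vector<int> missing;
    for (int proxy : graphProxies) {
        if (proxy < 0 && restored.find(proxy) == restored.end()) missing.push_back(proxy);
    }
    return missing;
}
```

If the graph refers to two proxies and only one of them was detected before the pause, `danglingProxies` reports the other one after the restore. Saving the whole map instead leaves nothing dangling.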

Options for Fixing

  1. save and resume the full ambiguity set via serialization
    • invalidates saved state (due to new objects being serialized)
    • moderate code churn
    • removes separate load/save functionality for ambiguities, subsuming it into pause/resume
  2. use the existing save/load scheme, independent of session pause/resume, but include all ambiguities, not just those with detections
    • invalidates saved state (we don't have a way to extract serialized references to undetected ambiguities)
    • minimal code churn
    • save/load functionality is conceptually different from pause/resume because the ambiguity map (between negative motus Tag IDs and tuples of ambiguous tags) is append-only: new pairings can be added to the map, but any which exist are there permanently.
  3. relocate the ambiguity table to the metadata DB
    • the ambiguity mapping could be globalized as each receiver is run, rather than when receiver data are merged into the master DB.
    • this requires wrapping ambiguity table queries into transactions to maintain consistency in ambiguity maps across tag-finder sessions running in parallel
    • moderate code churn, and slightly reduces independence of tag-finder sessions run in parallel

We'll go with 2, the simplest approach.
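
A rough sketch of what option 2 relies on, assuming a simplified in-memory table; `AmbiguityTable`, `idFor`, `all` and `load` are hypothetical names, not the project's API, and tags are again reduced to integer IDs. The point is the append-only property: an existing pairing is never altered, new sets get the next more-negative ID, and every row is exposed for saving whether or not it has been detected.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical append-only ambiguity table.
class AmbiguityTable {
public:
    // Return the proxy ID for a set of tags, allocating a new negative ID
    // only if this exact set has not been seen before.
    int idFor(const std::set<int>& members) {
        assert(members.size() >= 2);  // an ambiguity involves at least two tags
        auto it = byMembers_.find(members);
        if (it != byMembers_.end()) return it->second;
        int id = nextID_--;
        byMembers_.emplace(members, id);
        byID_.emplace(id, members);
        return id;
    }

    // Every row, detected or not: this is what option 2 would save.
    const std::map<int, std::set<int>>& all() const { return byID_; }

    // Restore previously saved rows and move the ID counter past them.
    void load(const std::map<int, std::set<int>>& rows) {
        for (const auto& kv : rows) {
            byID_.emplace(kv.first, kv.second);
            byMembers_.emplace(kv.second, kv.first);
            if (kv.first <= nextID_) nextID_ = kv.first - 1;
        }
    }

private:
    std::map<std::set<int>, int> byMembers_;
    std::map<int, std::set<int>> byID_;
    int nextID_ = -1;  // first proxy ID to hand out
};
```

Because rows are only ever added, loading a saved table into a fresh session and then asking for a known set returns the same proxy ID as before, and new sets continue the numbering below the smallest saved ID.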

jbrzusto commented 6 years ago

For testing, the files used in the run whose state was being resumed above can be obtained like so:

sqlite>  select batchID, progName, monoBN, tsData, tsRun from batchState where monoBN=176;
batchID     progName         monoBN      tsData           tsRun          
----------  ---------------  ----------  ---------------  ---------------
145         find_tags_motus  176         1511361607.3858  1520016195.8292
sqlite> select fileID, name from files where fileID=(select max(fileID) from batchFiles where batchID=145);
fileID      name                                                           
----------  ---------------------------------------------------------------
17576       kingsburg-4001BBBK2600-000176-2017-11-22T14-31-10.6900T-all.txt

Also, check when completed:

jbrzusto commented 6 years ago

The first attempt at fixing isn't working. We still get spinning at the same tag event. Backtrace:

#0  link_point (inf=<synthetic pointer>, k=0x0, this=<optimized out>) at /usr/local/include/boost/multi_index/detail/ord_index_impl.hpp:1015
#1  replace_<boost::multi_index::detail::lvalue_tag> (variant=..., x=0x25bcf70, v=..., this=<optimized out>)
    at /usr/local/include/boost/multi_index/detail/ord_index_impl.hpp:807
#2  replace_ (x=0x25bcf70, k=..., this=<optimized out>) at /usr/local/include/boost/multi_index_container.hpp:802
#3  final_replace_ (x=0x25bcf70, k=..., this=0x699a60 <Ambiguity::abm+32>) at /usr/local/include/boost/multi_index/detail/index_base.hpp:269
#4  replace (x=..., position=..., this=0x699a60 <Ambiguity::abm+32>) at /usr/local/include/boost/multi_index/detail/ord_index_impl.hpp:388
#5  boost::bimaps::detail::map_view_base<boost::bimaps::views::map_view<boost::bimaps::relation::member_at::right, boost::bimaps::detail::bimap_core<std::set<Tag*, std::less<Tag*>, std::allocator<Tag*> >, Tag*, mpl_::na, mpl_::na, mpl_::na> >, boost::bimaps::relation::member_at::right, boost::bimaps::detail::bimap_core<std::set<Tag*, std::less<Tag*>, std::allocator<Tag*> >, Tag*, mpl_::na, mpl_::na, mpl_::na> >::replace_data<std::set<Tag*, std::less<Tag*>, std::allocator<Tag*> > > (
Python Exception <type 'exceptions.ValueError'> Cannot find type const std::set<Tag*, std::less<Tag*>, std::allocator<Tag*> >::_Rep_type: 
    this=this@entry=0x699a78 <Ambiguity::abm+56>, position=..., d=std::set with 1 elements) at /usr/local/include/boost/bimap/detail/map_view_base.hpp:151
#6  0x000000000040e08e in Ambiguity::add (t1=t1@entry=0x10c6240, t2=t2@entry=0xd7bb80) at Ambiguity.cpp:37
#7  0x000000000041c431 in Graph::addTag (this=0x10bd3b0, tag=tag@entry=0xd7bb80, tol=0.0015, timeFuzz=0.001, maxTime=84, timestamp_wonkiness=0) at Graph.cpp:55
#8  0x000000000042c888 in Tag_Foray::process_event (this=this@entry=0x7fffffffe040, e=...) at Tag_Foray.cpp:227
#9  0x000000000042d83d in Tag_Foray::start (this=this@entry=0x7fffffffe040) at Tag_Foray.cpp:186
#10 0x000000000040a8e5 in main (argc=<optimized out>, argv=<optimized out>) at find_tags_motus.cpp:774
(gdb) frame 6
#6  0x000000000040e08e in Ambiguity::add (t1=t1@entry=0x10c6240, t2=t2@entry=0xd7bb80) at Ambiguity.cpp:37
37        abm.right.replace_data(i, s); // alter the bimap
(gdb) p t1
$13 = (Tag *) 0x10c6240
(gdb) p t2
$14 = (Tag *) 0xd7bb80
(gdb) p *t1
$15 = {motusID = -27, freq = 166.3800048828125, dfreq = 0.373800009, gaps = std::vector of length 4, capacity 4 = {0.02197, 0.024410000000000001, 0.043950000000000003, 
    19.904169999999997}, period = 19.994499999999999, count = 0, mfgID = 2, codeSet = 4, active = true}
(gdb) p *t2
$16 = {motusID = 27790, freq = 166.3800048828125, dfreq = -0.514100015, gaps = std::vector of length 4, capacity 4 = {0.02197, 0.024410000000000001, 
    0.043950000000000003, 19.904770021362303}, period = 19.995100021362305, count = 0, mfgID = 2, codeSet = 4, active = false}

jbrzusto commented 6 years ago

~Perhaps the additional problem is addition of a 3rd tag to an ambiguity. That is supposed to work.~ It does work: running the full boot session without pause/resume succeeds.

There was also a problem with the query for calculating the nextID. DB_Filer::st_get_last_ambig had the arguments to coalesce reversed, so it always returned -1.
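
For anyone wondering why the argument order matters: SQL's coalesce returns its first non-NULL argument, so a constant default placed first always wins. The sketch below models that in C++. The actual query text is not shown in this thread, so `NullableInt`, `lastAmbigCorrect`, `lastAmbigReversed` and the idea of an aggregate falling back to a constant are assumptions made for illustration only.

```cpp
#include <cassert>
#include <initializer_list>

// Minimal nullable integer standing in for an SQL value.
struct NullableInt {
    bool isNull;
    int value;
};

// Same rule as SQL's coalesce(): return the first non-NULL argument,
// or NULL if every argument is NULL.
NullableInt coalesce(std::initializer_list<NullableInt> args) {
    assert(args.size() >= 2);  // SQL coalesce takes at least two arguments
    for (const NullableInt& a : args) {
        if (!a.isNull) return a;
    }
    return NullableInt{true, 0};
}

// Intended order: use the stored value, fall back to the default only when
// the table is empty (stored value is NULL).
int lastAmbigCorrect(NullableInt minStored, int fallback) {
    return coalesce({minStored, NullableInt{false, fallback}}).value;
}

// Reversed order: the constant is never NULL, so the stored value is ignored.
int lastAmbigReversed(NullableInt minStored, int fallback) {
    return coalesce({NullableInt{false, fallback}, minStored}).value;
}
```

With a stored value of -27 and a fallback of -1, the correct order yields -27 while the reversed order yields -1 regardless of what is stored, which matches the "always returning -1" symptom.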

jbrzusto commented 6 years ago

In the end, we went with a combination of fix options 1 and 2. The map between proxy tags and the sets of ambiguous tags they represent is handled in tandem: