kputnam / piggly

PL/pgSQL stored procedure code coverage tool

No tag with id X (RuntimeError) #43

Closed villelahdenvuo closed 6 years ago

villelahdenvuo commented 6 years ago

I made a script to run piggly:

#!/usr/bin/env bash

piggly trace -s '/cas\./'

pg_prove --ext .sql -h postgres -q -S client_min_messages='warning' 2> coverage.txt

echo "#### COVERAGE"
cat coverage.txt
echo "####"

piggly report -s '/cas\./' -f coverage.txt

piggly untrace -s '/cas\./'

But it gives me the following error:

compiling 1 procedures
ProcessQueue running concurrently
Compiling cas.update_timestamp
tracing 1 procedures
Compiling cas.update_timestamp
t/user.sql .. ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs ( 0.03 usr  0.01 sys +  0.02 cusr  0.04 csys =  0.10 CPU)
Result: PASS
#### COVERAGE
psql:t/user.sql:10: WARNING:  PIGGLY 9602c614b41623a5
psql:t/user.sql:10: WARNING:  PIGGLY 4a573c5ae5aa22f9
####
Compiling cas.update_timestamp
clearing previous coverage
/var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/profile.rb:25:in `[]': No tag with id 9602c614b41623a5 (RuntimeError)
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/profile.rb:35:in `ping'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/profile.rb:100:in `block in notice_processor'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/command/report.rb:68:in `block in read_profile'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/command/report.rb:68:in `each'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/command/report.rb:68:in `read_profile'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/command/report.rb:37:in `main'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/command/base.rb:15:in `main'
        from /var/lib/gems/2.3.0/gems/piggly-2.2.4/bin/piggly:8:in `<top (required)>'
        from /usr/local/bin/piggly:22:in `load'
        from /usr/local/bin/piggly:22:in `<main>'
restoring 1 procedures

The generated Trace code is:

BEGIN perform public.piggly_branch($PIGGLY$2a293b29b5a03a84$PIGGLY$);NEW.updated_at = now(); perform public.piggly_branch($PIGGLY$dba45fbf3b18e0d4$PIGGLY$);RETURN NEW; END;

So it seems the IDs indeed don't match! The piggly trace output looks suspicious because it prints "Compiling cas.update_timestamp" twice; maybe that's the reason? But why?

kputnam commented 6 years ago

Indeed that is strange. Does pg_prove happen to define any procedures (unlikely to be the cause)? Do you possibly have two definitions for cas.update_timestamp, maybe each takes different arguments or argument types? I'll peruse the code this week and see if we can figure it out.

villelahdenvuo commented 6 years ago

pg_prove doesn't define any procedures as far as I know. I do have pgTAP installed in the public schema, but those functions are skipped thanks to the -s '/cas\./' selector. I have tried deleting the cache and re-running without the filtering, but the same problem occurs; I only kept the filtering to reduce the log output for the example.

kputnam commented 6 years ago

Hmm, I can only come up with two theories. Maybe the current working directory changes somewhere during the process, so piggly fails to find a file it just wrote that would otherwise indicate the stored procedure was already compiled. I haven't looked closely enough to say whether this is possible, but it seems unlikely. Could you confirm that piggly trace -s '/cas\./' alone, without the steps afterward, prints the message "Compiling cas.update_timestamp" twice?

I think more likely, the SQL query used to find stored procedures could be returning a duplicate record, perhaps due to a missing JOIN constraint. Could you check if that's the case with this query (do you see two rows for cas.update_timestamp?):

select
  pro.oid,
  nschema.nspname   as nschema,
  pro.proname       as name,
  pro.proisstrict   as strict,
  pro.prosecdef     as secdef,
  pro.provolatile   as volatility,
  pro.proretset     as setof,
  rschema.nspname   as tschema,
  ret.typname       as type,
  pro.prosrc        as source,
  pro.pronargs      as arg_count,
  array_to_string(pro.proargmodes, ',') as arg_modes,
  array_to_string(pro.proargnames, ',') as arg_names,
  case when proallargtypes is not null then
         -- use the proallargtypes array if it's non-null
         array_to_string(array(select format_type(proallargtypes[k], null)
                               from generate_series(array_lower(proallargtypes, 1),
                                                    array_upper(proallargtypes, 1)) as k), ',')
       else
         -- fallback to oidvector proargtypes
         oidvectortypes(pro.proargtypes)
       end as arg_types,
  pro.pronargdefaults as arg_defaults_count,
  coalesce(pg_get_expr(pro.proargdefaults, 0), '') as arg_defaults
from pg_proc as pro,
     pg_type as ret,
     pg_namespace as nschema,
     pg_namespace as rschema
where pro.pronamespace = nschema.oid
  and ret.typnamespace = rschema.oid
  and pro.proname not like 'piggly_%'
  and pro.prorettype = ret.oid
  and pro.prolang = (select oid from pg_language where lanname = 'plpgsql')
  and pro.pronamespace not in (select oid
                               from pg_namespace
                               where nspname like 'pg_%'
                                  or nspname like 'information_schema')
order by 1, 2;

villelahdenvuo commented 6 years ago

I will try that and also try to make a minimal repro so you can try it on your machine. 👍

villelahdenvuo commented 6 years ago

Indeed running piggly trace compiles the procedure twice:

postgres@b304881786e3:~$ piggly trace -s '/cas\./'
compiling 1 procedures
ProcessQueue running concurrently
Compiling cas.update_timestamp
tracing 1 procedures
Compiling cas.update_timestamp
postgres@b304881786e3:~$

The procedure should only exist once:

postgres=# \df cas.*
                               List of functions
 Schema |       Name       | Result data type  | Argument data types |  Type
--------+------------------+-------------------+---------------------+---------
 cas    | update_timestamp | trigger           |                     | trigger
 cas    | user_full_name   | character varying | u cas."user"        | normal
 cas    | user_groups      | SETOF cas."group" | userobj cas."user"  | normal
(3 rows)

The SQL also only lists the procedure once. Very strange.

56147 | cas | update_timestamp | f | f | v | f | pg_catalog | trigger | BEGIN NEW.updated_at = now(); RETURN NEW; END;

villelahdenvuo commented 6 years ago

@kputnam I made the minimal repro case here:

git clone https://gist.github.com/9f98370df6a7faa5ae999c96b669acc1.git issue42
cd issue42
chmod +x *.sh # Gist probably doesn't save file permissions?
./repro.sh

You need to have Docker installed; the repro should work on Mac and probably on Linux, too. Hopefully this helps you debug the problem. If you need any additional information, I can try to help.

villelahdenvuo commented 6 years ago

I was playing with the repro code and found something interesting. When I start the container and run the test script, it fails; but if I keep re-running the test script, then after some random number of failures it suddenly starts working! And after that it works every time I run it. Very peculiar!

That is, on the first runs it compiles the procedure twice for no apparent reason, but after running the script several times (with a fresh database each time, but the same container instance running; I added DROP SCHEMA cas CASCADE as a cleanup step), it starts working correctly and only compiles the procedure once.

kputnam commented 6 years ago

Thanks, this is very helpful! Yes, I'm seeing the problem intermittently too. I'll look into it and hopefully fix it soon.

kputnam commented 6 years ago

I'm not sure why this happens yet, but the recompile is triggered because the compiled output is older than some of piggly's own source files. That part is intentional: upgrading piggly should force a recompile.
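For illustration, here is a minimal sketch (not piggly's actual implementation) of an mtime-based staleness check like the one Piggly::Util::File.stale? performs: a cached compiler output counts as stale when it is missing or when any of its source dependencies has a newer modification time.

```ruby
require 'tempfile'

# Sketch of an mtime-based staleness check: the cached output is stale
# when it is missing or older than any of its source dependencies.
def stale?(cache_path, *source_paths)
  return true unless File.exist?(cache_path)
  cached_at = File.mtime(cache_path)
  source_paths.any? { |src| File.mtime(src) > cached_at }
end

# Demonstrate with two temp files whose mtimes we set explicitly.
cache  = Tempfile.new('compiled')
source = Tempfile.new('grammar')
now = Time.now

File.utime(now, now, cache.path)         # cache written "now"
File.utime(now, now - 60, source.path)   # source is a minute older
puts stale?(cache.path, source.path)     # => false, cache is fresh

File.utime(now, now + 60, source.path)   # source appears newer (e.g. clock skew)
puts stale?(cache.path, source.path)     # => true, forces a recompile
```

The second case is exactly what the log below shows: the cache file is rewritten, yet a source file still reports a newer mtime, so every run recompiles.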

Here's what I see after adding a debug message to Piggly::Util::File.stale?:

compiling 1 procedures
ProcessQueue running concurrently
/home/postgres/piggly/cache/Trace/74aa23b825898b1ac0d1cf574a5bad3d.plpgsql (2018-04-20 02:52:29 +0000) is older than /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/parser/grammar.tt (2018-04-20 02:59:41 +0000)
Compiling cas.update_timestamp
tracing 1 procedures
/home/postgres/piggly/cache/Trace/74aa23b825898b1ac0d1cf574a5bad3d.plpgsql (2018-04-20 02:59:50 +0000) is older than /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/parser/parser.rb (2018-04-20 02:59:51 +0000)
Compiling cas.update_timestamp
INSERT 0 1
UPDATE 1
####
WARNING:  PIGGLY 751be36a36b09700
WARNING:  PIGGLY ca1fbc5a9a0865ab
####
/home/postgres/piggly/cache/Trace/74aa23b825898b1ac0d1cf574a5bad3d.plpgsql (2018-04-20 02:59:50 +0000) is older than /var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/parser/parser.rb (2018-04-20 02:59:51 +0000)
Compiling cas.update_timestamp
clearing previous coverage
/var/lib/gems/2.3.0/gems/piggly-2.2.4/lib/piggly/profile.rb:25:in `[]': No tag with id 751be36a36b09700 (RuntimeError)

I'm going to see if mtime is acting unexpectedly. This is my first time using Docker, so I'm not sure if I should suspect anything there. I'll try to reproduce it on another Linux machine, since I've used piggly only on macOS for quite a while and maybe this is a latent bug.
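One quick way to check that suspicion is to write a probe file and compare its reported mtime against the local clock; Docker for Mac's host-mounted volumes have been reported to show skewed timestamps. This is a sketch, and the mounted path in the final comment is hypothetical:

```ruby
require 'tmpdir'

# Probe whether a filesystem reports mtimes ahead of (or behind) this
# machine's clock. Run it once against a path inside the container and
# once against a host-mounted volume; a difference between the two skews
# would explain the spurious "is older than" recompiles above.
def mtime_skew(dir)
  probe = File.join(dir, ".piggly-mtime-probe-#{Process.pid}")
  File.write(probe, '')
  skew = File.mtime(probe) - Time.now
  File.delete(probe)
  skew
end

puts format('skew on %s: %+.3f s', Dir.tmpdir, mtime_skew(Dir.tmpdir))
# mtime_skew('/repro/piggly/cache')  # hypothetical host-mounted path
```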

kputnam commented 6 years ago

Ah, it seems to be Docker-related. Can you try adding -c /tmp/piggly to each invocation of piggly in test.sh? That makes piggly store its compiler output inside the container rather than on a directory mounted from the host. Here's what I have:

#!/usr/bin/env bash
set -ueo pipefail

# Wait for DB init.
sleep 5

# Run piggly
export PGHOST=postgres
psql -c 'CREATE SCHEMA cas;'
psql -c 'CREATE FUNCTION cas.update_timestamp() RETURNS TRIGGER AS $$ BEGIN NEW.updated_at = now(); RETURN NEW; END; $$ LANGUAGE plpgsql'
psql -c 'CREATE TABLE cas.test (id INTEGER PRIMARY KEY, name TEXT, updated_at TIMESTAMPTZ);'
psql -c 'CREATE TRIGGER tr_update_timestamp BEFORE UPDATE ON cas.test FOR EACH ROW EXECUTE PROCEDURE cas.update_timestamp();'

piggly trace -s '/cas\./' -c /tmp/piggly -o piggly/reports

psql -c "INSERT INTO cas.test (id, name) VALUES (1, 'foobar');"
psql -c "UPDATE cas.test SET name = 'barfoo' WHERE id = 1;" 2> coverage.txt

piggly report -s '/cas\./' -f coverage.txt -c /tmp/piggly -o piggly/reports
piggly untrace -s '/cas\./' -c /tmp/piggly

villelahdenvuo commented 6 years ago

@kputnam Indeed, that fixed it! What a strange issue. I can mount only the reports folder, and that should solve the problem. However, even if piggly recompiles the procedures, shouldn't it then use the latest tags and just work? Anyway, with this information I can make my setup work. Thanks a lot for the debugging help!

kputnam commented 6 years ago

Glad it worked! My guess is that the second compile wasn't the problem; the third one, triggered by piggly report after the coverage has already been logged, is the culprit. It probably doesn't even make sense to recompile there, even if something looks stale, though skipping it could create problems too: a newer version of piggly or treetop might not be backward compatible with the parse tree generated by an earlier version.
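To make that failure mode concrete, here is a toy model (inferred from the stack trace above, not piggly's real internals) of why a recompile after tracing breaks the report: each compile embeds freshly generated tag ids into the traced procedure, so recompiling after the coverage has been logged leaves the profile indexing new ids while coverage.txt still references the old ones.

```ruby
require 'securerandom'

# Toy profile: maps tag ids to hit counts, like the index behind the
# `[]` lookup at profile.rb:25.
class ToyProfile
  def initialize
    @tags = {}
  end

  # Compiling generates fresh random ids and rebuilds the index;
  # ids from any earlier compile are forgotten.
  def compile
    @tags = Array.new(2) { [SecureRandom.hex(8), 0] }.to_h
    @tags.keys
  end

  # Replay a logged id against the index; unknown ids raise, which is
  # exactly the "No tag with id ..." RuntimeError in this issue.
  def ping(id)
    raise "No tag with id #{id}" unless @tags.key?(id)
    @tags[id] += 1
  end
end

profile = ToyProfile.new
logged_ids = profile.compile    # trace: these ids end up in coverage.txt
profile.compile                 # spurious recompile before the report
begin
  profile.ping(logged_ids.first)
rescue RuntimeError => e
  puts e.message                # "No tag with id <old id>"
end
```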