OraOpenSource / oos-utils

Common PL/SQL utility scripts
MIT License

Transaction Number Generator #173

Open martindsouza opened 6 years ago

martindsouza commented 6 years ago

Help Wanted!

Please read through this. I need help determining what to call these function(s).

Problem

Sometimes business users want a transaction number. Ex: invoice number.

Most people create a sequence and then left-pad it with some 0s. Ex: 0081. The problem is what happens once we pass 9999: we'll then have a mix of 4-character and 5-character transaction numbers.

A simple solution is to convert the sequence to hex, giving transaction numbers like A08F. Though this helps, in high-transaction systems the 6 additional symbols per character aren't enough, or the transaction numbers have to be purposely long. See the table below for stats.

The following table shows how many values you can get for the number of characters:

Characters  Base 10      Base 16 (Hex)  Base 36 (0-Z)
1           10           16             36
2           100          256            1296
3           1000         4096           46656
4           10000        65536          1679616
5           100000       1048576        60466176
6           1000000      16777216       2176782336
7           10000000     268435456      78364164096
8           100000000    4294967296     2821109907456
9           1000000000   68719476736    101559956668416
10          10000000000  1099511627776  3656158440062976
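
Each count in the table is just the base raised to the number of characters. As a rough check, the table can be regenerated with a query like the one below. This is only a sketch and not part of the proposal; the column aliases are made up for illustration.

```sql
-- Illustrative only: the column aliases are not from the issue.
-- Each value is base ^ characters.
select
  level characters,
  power(10, level) base_10,
  power(16, level) base_16,
  power(36, level) base_36
from dual
connect by level <= 10;
```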

Solution

The proposed solution is to allow transaction numbers that use 0-Z, i.e. 0,1,...,9,A,B,...,Z for each character (base 36). We could expand this in the future to go beyond base 36, but we would need to define what the 37th character would look like.

The following query converts a number to base 36:

with 
  lvls as (
    select level lvl
    from dual
    -- The <= logic will return the number of characters required for conversion
    connect by level <= ceil(log(:base, :x)) + decode(log(:base, :x), ceil(log(:base, :x)), 1,0)
  ),
  -- Alphabet 0..Z
  alphabet as (
    select 
      level-1 num,
      case
        when level-1 < 10 then to_char(level-1)
        else chr( ascii('A')+level-1-10)
      end letter
    from dual
    connect by level <= :base
  ),
  -- Returns rows for all the decimal values for each character position
  my_data as (
    select
      to_char(:x, 'XXXXXX') hex_val, -- for testing
      lvl,
      remainder,
      quotient
    from lvls
    model
      return all rows
      dimension by (lvl)
      measures( 0 remainder, 0 quotient)
      rules 
      ( 
        -- Order matters here. I.e. remainder must come after quotient so it can "see" the previous level's quotient
        quotient[lvl] = trunc(nvl(quotient[cv(lvl)-1], :x) / :base),
        remainder[lvl] = mod(nvl(quotient[cv(lvl)-1], :x), :base)

      )
  )
select
  to_char(:x, 'XXXXXX') hex_conv, -- to test for hex
  listagg(a.letter, '') within group (order by md.lvl desc) basex
from my_data md, alphabet a
where 1=1
  and md.remainder = a.num
-- For testing
--select *
--from my_data
;

PL/SQL version:

create or replace function basex (
  p_num in integer,
  p_base in integer)
  return varchar2
as
  l_return varchar2(255);
  l_quotient integer;
  l_remainder integer;
begin

  -- TODO mdsouza: check that p_num >= 0 and p_base is between 10 and 36

  l_quotient := p_num;

  while l_quotient > 0 loop
    l_remainder := mod(l_quotient, p_base);
    l_quotient := trunc(l_quotient / p_base);

    if l_remainder < 10 then
      l_return := to_char(l_remainder) || l_return;
    else
      -- Subtract 10 since 0-9 are handled above
      l_return := chr(ascii('A') + l_remainder - 10) || l_return;
    end if;
  end loop;

  return l_return;
end basex;
/
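
One way the conversion above could be turned into a fixed-width transaction number is sketched below. It is only an illustration under a few assumptions: the name txn_number, the p_length parameter, the error numbers, and the invoice_seq sequence in the usage comment are all hypothetical and not part of the issue. The sketch repeats the same divide-and-take-remainder loop so that it compiles on its own, and it treats 0 explicitly because the loop in basex never runs for 0 and so returns null.

```sql
-- Hypothetical sketch: txn_number, p_length and the error numbers are illustrative only.
create or replace function txn_number (
  p_num    in integer,
  p_length in integer default 6)
  return varchar2
as
  c_symbols  constant varchar2(36) := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  c_base     constant integer := 36;
  l_return   varchar2(255);
  l_quotient integer := p_num;
begin
  if p_num is null or p_num < 0 then
    raise_application_error(-20001, 'p_num must be a non-negative integer');
  end if;

  -- Same divide-and-take-remainder loop as basex, fixed to base 36
  while l_quotient > 0 loop
    l_return   := substr(c_symbols, mod(l_quotient, c_base) + 1, 1) || l_return;
    l_quotient := trunc(l_quotient / c_base);
  end loop;

  -- The loop never runs for 0, so l_return is still null at this point
  l_return := nvl(l_return, '0');

  -- Refuse to return a number that is wider than requested
  if length(l_return) > p_length then
    raise_application_error(-20002, 'p_num needs more than ' || p_length || ' characters');
  end if;

  return lpad(l_return, p_length, '0');
end txn_number;
/

-- Usage (invoice_seq is a hypothetical sequence):
--   select txn_number(invoice_seq.nextval) from dual;
-- txn_number(35)         returns 00000Z
-- txn_number(36)         returns 000010
-- txn_number(2176782335) returns ZZZZZZ
```

Raising an error on overflow, instead of silently returning a longer value, keeps every transaction number the same width, which is the point of the exercise.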

Tasks

jeffreykemp commented 6 years ago

I would call it to_base36.

connormcd commented 6 years ago

I found that enumerating the symbols in advance, and exchanging the MOD for subtraction gives a little perf boost. Around 15% on my machine.

create or replace function basex2 (
  p_num in integer,
  p_base in integer)
  return varchar2
as
  l_symbols      varchar2(64) := '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
  l_return       varchar2(255);
  l_quotient     integer := p_num;
  l_trunced_part integer;
  l_remainder    integer;
begin

  while l_quotient > 0 loop
    l_trunced_part := trunc(l_quotient / p_base);
    l_remainder    := l_quotient - l_trunced_part*p_base;
    l_quotient     := l_trunced_part;

    l_return := substr(l_symbols,l_remainder+1,1) || l_return;
  end loop;

  return l_return;
end;
/

zhudock commented 6 years ago

#67 and #128 both reference base conversion as well. I realize the goal of this issue is to add some additional functionality on top of the base conversion, but we should avoid duplicating any code.

zhudock commented 6 years ago

Regarding @dmcghan's suggestion on Twitter, this file from PWGen has the list of ambiguous characters they use.

http://pwgen.cvs.sourceforge.net/viewvc/pwgen/src/pw_rand.c?view=markup

const char *pw_ambiguous = "B8G6I1l0OQDS5Z2";

I'd suggest adding lowercase o, i, s, and z as well, but excluding everything from that list may be a bit too aggressive.
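
To get a feel for how aggressive that is, the query below strips every pw_ambiguous character from the 0-Z alphabet. It is only a sketch; the column aliases are made up, and the leading x is just a placeholder that is not in the alphabet, there so that translate() has a non-empty replacement string.

```sql
-- Illustrative only: the column aliases are not from the issue.
-- translate() maps x to x and drops every other listed character,
-- because those characters have no counterpart in the third argument.
select
  translate('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'xB8G6I1l0OQDS5Z2', 'x') symbols,
  length(translate('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'xB8G6I1l0OQDS5Z2', 'x')) symbol_count
from dual;
-- symbols: 3479ACEFHJKLMNPRTUVWXY, symbol_count: 22
```

That leaves a base 22 alphabet, so 6 characters would cover 22^6 = 113379904 values instead of the 2176782336 available in base 36.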