loda-lang / loda-cpp

Runtime and miner for the LODA language written in C++
https://loda-lang.org/
Apache License 2.0
21 stars 1 forks source link

A053833 - Skipped submission because there exists already a (faster) program #106

Closed neoneye closed 2 years ago

neoneye commented 2 years ago
PROMPT> loda submit A053833 myprogram.asm
2022-02-08 18:55:38|INFO |Starting LODA v22.2.8. See https://loda-lang.org/
2022-02-08 18:55:38|INFO |Using LODA home directory “/Users/neoneye/loda/”
2022-02-08 18:55:38|INFO |Loading sequences from the OEIS index
2022-02-08 18:55:44|INFO |Loaded 330060/350980 sequences in 6.39s
2022-02-08 18:55:52|INFO |Initialized 5 matchers (ignoring 1440 sequences)
2022-02-08 18:55:52|INFO |Validating program for A053833
2022-02-08 18:55:52|INFO |Validation successful
2022-02-08 18:55:52|INFO |Skipped submission because there exists already a (faster) program
PROMPT>

My program:

; A053833: Sum of digits of n written in base 13.
; Submitted by Simon Strandgaard
; 0,1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,5,6,7,8,9,10,11,12,13,2,3,4,5,6,7,8,9,10,11,12,13,14,3
lpb $0
  mov $2,$0
  mod $2,13
  add $1,$2
  div $0,13
lpe
mov $0,$1

I think the submit failed because A053833 has existed in the past.

If I delete the ~/loda/stats dir and regenerate it using loda mine. Then programs.csv contains this row: 53833,0,1,0. It seems to continue to have knowledge that A053833 has existed in the past. I tried deleting the row in programs.csv, but submit still fails the same way.

ckrause commented 2 years ago

It is a combination of multiple problems:

  1. The miner uses the first 10 terms to narrow down possible matches
  2. The sequence starts with 0,1,2,3,4,5,6,7,8,9 -- a highly trivial sequence
  3. There is a heuristic in the matcher that avoids large amounts of matches with the same sequence start:

https://github.com/loda-lang/loda-cpp/blob/main/src/matcher.cpp#L65-L69

The heuristic is important for performance. Otherwise the miner would spend too much time for sequences wth trivial intial terms. Also using more than 10 initial terms is not really an option because there are LOTS of sequences with >=10 but <20 terms. There is no easy fix for this. Not sure how to solve it. We can add the program directly to the program repo, but it doesn't solve the matcher problem.

ckrause commented 2 years ago

Submitting this program works now, but the server accepts it only if it runs with backoff: false. Currently we don't do this because it is much slower. I added A053833 manually to the programs repo:

https://github.com/loda-lang/loda-programs/blob/main/oeis/053/A053833.asm