gbdev / rgbds

Rednex Game Boy Development System - An assembly toolchain for the Nintendo Game Boy and Game Boy Color
https://rgbds.gbdev.io
MIT License
1.34k stars 172 forks source link

Should PURGE-ing refs be allowed? #492

Closed ISSOtm closed 4 years ago

ISSOtm commented 4 years ago
SECTION "hm", ROM0
  db Label
  PURGE Label

This will work fine, unless called with a -o param, in which case we'll get a use-after-free.

The UAF will be fixed anyways, but still, should PURGE be allowed to work on symbol refs? I think not, because while refs are technically symbols in RGBASM for logistical reasons, they don't define anything per se, and thus it doesn't make sense to delete them.

By extension, should deleting a symbol referenced in a patch be allowed? Example:

SECTION "hm", ROM0
  db Label ; Non-constant, emits a patch
Label:
  PURGE Label

If it is allowed, should that turn the label into a ref instead?

aaaaaa123456789 commented 4 years ago

There are good reasons to want to define a symbol twice, but it's probably not going to be easy.

The problem:

SECTION "Test", ROM0[0]

  dw Test
Test:
  dw Test
PURGE Test
  dw Test
Test:
  dw Test
PURGE Test
Test::
  dw Test

What we probably want here to be emitted is 08 00 02 00 08 00 06 00 08 00. But how to be sure? And how to make this not break existing codebases?

Possible solutions:

There's a few ways to solve this problem. None are entirely satisfactory as I see them, but it would probably be better to just pick one and declare it to be the defined behavior:

  1. Just ban this. Probably the simplest approach in the long run, but also the most restrictive.
  2. Use lexical scope for non-exported labels only. This would ensure that the problem doesn't leave the domain of RGBASM. However, this also means that a label cannot be used before declaring it unless it is declared as exported (Foo::). Perhaps a worthwhile exception can be made for local labels, but this still has the chance of breaking existing codebases.
  3. Special-case the one-declaration behavior. If a label is declared once, it works the same way it has worked so far. If a label is declared multiple times, use lexical scoping. If a label is declared in multiple translation units, any external references to it should fail. This basically gives the most likely expected behavior, at the expense of creating a massive newbie trap.
  4. Create a new type of label that can be lexically scoped and redefined, and only allow purging that label. (A "new type" of label can be created simply by adding a special character at the beginning or the end of the declaration: something like ?foo would be enough.); this new label type wouldn't interact at all with regular labels (i.e., no new scope for local labels, no name collision, etc.). In fact, since it's a new language element, it would be perfectly reasonable to define new semantics for it — e.g., that it can be redefined directly without purging it explicitly. This would probably solve all problems, but at the cost of creating a new language element.

My view:

I don't think solution 2 is a good idea; either of 3 or 4 would work, and ultimately 1 wouldn't be too terrible. I think option 4 is the best in the long run, but it would probably also require the largest amount of work. A simple "1 for now, 4 later if needed" approach would be OK in my view.

ISSOtm commented 4 years ago

Option 4 sounds a lot like #342, no?

aaaaaa123456789 commented 4 years ago

Option 4 is basically #342 if you automatically purge all labels declared in a file when you reach the end of that physical file.

ISSOtm commented 4 years ago

Although there was an idea to extend the idea to more scopes than just file scope, but the general idea is there: option 1 for now, and option 4 will be implemented by #342.

ISSOtm commented 4 years ago

Closing this in favor of #342.