Check that attributes are placed correctly on modules vs sentences

radumereuta commented 1 year ago

We could consider splitting the attributes into <modules> | <rule> | <syntax> ... Some of them can be in multiple categories. We could also check if attributes require a value or are allowed to have values. This right now is handled by the step which uses the attribute.

gtrepta commented 1 year ago

Here's a list of BuiltIn attributes found for each Sentence type that had attributes on them when running mvn verify.

Context	Module	Production	Rule	SyntaxSort
avoid	concrete	alias	alias	cellCollection
color	haskell	applyPriority	anywhere	element
colors	kore	assoc	avoid	hook
format	not-lr1	avoid	color	locations
hybrid	private	binder	colors	token
klabel	symbolic	bracket	concrete	unit
latex	cell	context
left	cellCollection	cool
non-assoc	cellName	format
prefer	color	heat
right	colors	hybrid
seqstrict	comm	initializer
strict	element	klabel
exit	label
format	latex
freshGenerator	left
function	macro
hook	non-assoc
hybrid	non-executable
idem	owise
impure	prefer
index	priority
initializer	result
injective	right
internal	seqstrict
klabel	simplification
latex	smt-lemma
left	stream
macro	strict
maincell	structural
mlBinder	symbolic
mlOp	unboundVariables
multiplicity
no-evaluators
noThread
non-assoc
parser
prec
prefer
private
public
returnsUnit
right
seqstrict
smt-hook
smtlib
stream
strict
symbol
token
topcell
total
type
unit
unparseAvoid
unused
wrapElement

gtrepta commented 1 year ago

Here are Internal attributes (minus Source/Location, which appear on all of them).

Context	Module	Production	Rule
userList	digest	bracketLabel	UNIQUE_ID
cellFragment	cool-like
cellOptAbsent	org.kframework.definition.Production
org.kframework.definition.Production	projection
predicate	userList
projection
userList

dwightguth commented 1 year ago

Don't worry about checking internal attributes; if they're used incorrectly it's a bug, we don't need to explicitly check for that.

Okay, here's what I have:

Attribute	Types	Value (Yes/No/Optional)
alias	rule or production	N
alias-rec	rule or production	N
all-path	claim	N
anywhere	rule	N
applyPriority	production	Y
assoc	production	N
avoid	production	N
bag	production	N
binder	production	O
bracket	production	N
cell	production	N
cellCollection	production	N
cellName	production	Y
circularity	???
color	production	Y
colors	production	Y
comm	production	N
concrete	production or rule or module	O
constructor	production	N
context	context alias	Y
cool	rule	N
depends	???
element	production	Y
exit	production	N
format	production	Y
freshGenerator	production	N
function	production	N
functional	production	N
group	any sentence (not module)	Y
haskell	module	N
heat	rule	N
hybrid	production	O
hook	production or sort	Y
idem	production	N
impure	production	N
index	production	Y
initial	production	N
initializer	rule	N
injective	production	N
internal	production	N
kast	module	N
klabel	production	Y
kore	rule or claim or module	N
label	any sentence (not module)	Y
latex	production	Y
left	production	N
lemma	???
locations	sort	N
macro	rule or production	N
macro-rec	rule or production	N
maincell	production	N
memo	production	N
mlBinder	production	N
mlOp	???
multiplicity	production	Y
no-evaluators	production	N
noThread	???
non-assoc	production	N
non-executable	rule	N
not-lr1	module	N
one-path	claim	N
owise	rule	N
parser	production	Y
prec	production	Y
prefer	production	N
preserves-definedness	rule	N
priority	rule	Y
private	module or production	N
public	module or rule	N
result	context alias or context or rule	Y
returnsUnit	production	N
right	production	N
seqstrict	production	O
simplification	rule	N
smtlib	production	Y
smt-hook	production	Y
smt-lemma	rule	N
stream	rule	O
strict	production	O
structural	???
symbol	production	N
symbolic	production or rule or module	O
tag	rule	Y
token	production	N
topcell	???
total	production	N
trusted	claim	N
type	production	Y
unblock	???
unboundVariables	rule or claim	Y
unit	production	Y
unparseAvoid	production	N
unused	production	N
wrapElement	production	Y

This does not precisely agree with the data you have above; I definitely note you mentioned some attributes are being used some places that I wouldn't expect them to be used. I'd like us to start with this data set and report what individual problems we have and decide on a case by case basis whether to change the implementation or the k definition.

Btw, the ones I marked as ??? appear, to the best of my knowledge, to either be dead code or ought to be replaced with groups or rule labels. It might be worth trying to eliminate them entirely as a precursor PR.

gtrepta commented 1 year ago

Here are the observed attributes and sentence types that don't belong together based on @dwightguth's list (I assumed Dwight's mention of sort corresponds with SyntaxSort).

Context	Production	Rule	SyntaxSort
avoid	initializer	avoid	cellCollection
color	mlOp (???)	color	element
colors	noThread (???)	colors	token
format	public	context	unit
hybrid	stream	format
klabel	topcell (???)	hybrid
latex	klabel
left	latex
non-assoc	left
prefer	non-assoc
right	prefer
seqstrict	right
strict	seqstrict
strict
structural (???)

dwightguth commented 1 year ago

Public can be on a production; that's my error. So can stream. I believe that cellCollection being on sorts might be right also; my memory of that attribute is probably incorrect. Which means it probably doesn't belong on productions.

The rest of these definitely seem like suspicious combinations. How did you measure whether a rule or context has an attribute? Some rules and contexts inherit the attributes of a production during compilation, but that doesn't mean that that attribute can be written by a user or that it means anything. Are you sure you collected data from before the compilation pipeline?

dwightguth commented 1 year ago

Bear in mind that mlOp, noThread, and topcell are definitely placed on definitions in practice, but they still should probably be removed from existence. noThread is specific to the ocaml backend that no longer exists and has no meaning; topcell was created with a purpose and never used for that purpose; mlOp should probably be a group.

gtrepta commented 1 year ago

Are you sure you collected data from before the compilation pipeline?

I placed it at the end of the pipeline, thinking we want to check attributes added by the transformations. Here's an updated table from just before the kore backend pipeline:

Production	Rule	SyntaxSort
cellCollection	comm	cellCollection
initializer	klabel	element
mlOp	structural	token
noThread	unblock	unit
topcell

I'm including cellCollection since there's still some uncertainty there.

Some rules and contexts inherit the attributes of a production during compilation, but that doesn't mean that that attribute can be written by a user or that it means anything.

So I just want to make sure I understand the intent here clearly. These checks are meant for attributes written by the user in K definitions only? I guess that makes more sense. I was thinking that checking attributes generated by the pipeline could aid developers as well, but admittedly that would make a lot of extra work here for probably only a marginal gain.

gtrepta commented 1 year ago

The left/right attributes should forbid any values, but currently they are being re-used to pass values to the backends. Some new internal attributes should be made to serve the purpose that left/right are being used for.

Here's where left/right are being made with values: https://github.com/runtimeverification/k/blob/a863a1e4618067b7027a12bcdbd2d1fca3ba8273/kernel/src/main/java/org/kframework/backend/kore/ModuleToKORE.java#L1606-L1607

This was introduced by #1287

Since this change will not be straightforward (It will require changes to the llvm-backend), I'm going to merge my PR #3579 with values for those attributes left as optional, and it can be addressed in a separate set of PRs.

gtrepta commented 1 year ago

mlOp and structural still haven't been addressed.

Dwight was suggesting that mlOp should probably be a group attribute. A lot of ML predicate productions use this, so they would need to use function instead (and total?)

structural appears to be dead, but it's mentioned extensively in the k tutorials so there would be a lot of cleanup to do there.

gtrepta commented 1 year ago

I've found a breakage in the c-semantics with these new changes.

[Error] Compiler: ContextAlias sentences can not have the following attributes:
[unboundVariables]
        Source(/home/gtrepta/runtimeverification/c-semantics/core/semantics/language/constant-exp.k)
        Location(154,29,155,65)
            .                               v~~~~~~~~~~~~~~
        154 |     context alias [constant]: elaborate(HERE)
        155 |       requires isResolved(HOLE) andBool isIntegerConstantExp(HOLE)
[result(RValResult), unboundVariables(HOLE)]
            .       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
[Error] Compiler: Had 1 structural errors.

I can open a PR to add Context and ContextAlias to unboundVariables allowed sentences. But, this raises a couple of points:

Should we make these errors into deprecation warnings to prevent a complete blockage that requires attention should they come up elsewhere?
Productions, Contexts, and ContextAliases can desugar into Rules, so some more thought needs to be put towards which of these rule-only attributes can be allowed on other sentences. For example, cool is rule-only and probably should stay that way because putting it on a strict production or a context wouldn't make any sense, but as we saw, unboundVariables could allow more than just rules.

dwightguth commented 1 year ago

I tried to address these but obviously this slipped through. Anything that might apply to top-level rewrite rules that might apply to a heating or cooling rule needs to be applicable to contexts and context aliases. Go ahead and update the PR. I don't think we need to make these into warnings though; it should be fairly quick to address any further breakages we find. It might not be a bad idea to test more semantics proactively though.

gtrepta commented 1 year ago

I think this is done.

runtimeverification / k

Check that attributes are placed correctly on modules vs sentences #3481