Closed staerz closed 2 years ago
Hey @staerz,
Would the rules for concurrent_009
and concurrent_011
apply here?
Such that signal_016
becomes equivalent to concurrent_011
and then add a new rule to handle the alignment?
Hi @jeremiah-c-leary,
maybe it is not 100% correct to say that these rule would apply there, but I think it is very similar as the concept is the same.
I mean this extended rule for signal declarations is for the case that we have a more complex data type, so it also goes a little along type-012 etc.
Most importantly, this and #549 aught to have the same meaning: the one for signals, the other for variables. And in both cases, I think declaration and assignment should be treated, and at best, very similarly (and most consistently).
It really seems like signal_016
should be depricated.
I see your point about a type_012
equivalent.
Let me come up with some test cases we can review before I start changing the rule.
Hi @jeremiah-c-leary,
you'd find some examples in the fpga_if.vhd
provided here. You'd have to change the variable assignment into a signal assignment.
If you want, I can try to identify other examples in our code base.
Hey @staerz and @sibeov,
I pushed an update for this issue to the issue-539 branch. I still have some refactoring to do, but I was hoping you would take some time and check out the updates and see if they meet your needs.
Thanks,
--Jeremy
Hi @jeremiah-c-leary,
alright, I tested it and it looks good, although I found sth. which I think we haven't considered yet: Upon declaration of a signal, if it is a vector type, one may have a range indication which I find more appropriate to have on the line of the signal declaration itself:
signal felix_packets : t_avst_packets(minimum(3, G_TILES) - 1 downto 0)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
Instead, what VSG currently (with this patch) want me to do is this:
signal felix_packets : t_avst_packets
(
minimum(3, G_TILES) - 1 downto 0
)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
Note that I use this configuration:
"signal_016":{
"first_paren_new_line":"yes",
"last_paren_new_line":"yes",
"open_paren_new_line":"yes",
"close_paren_new_line":"yes",
"new_line_after_comma":"ignore_positional",
"assign_on_single_line":"ignore"
},
Note that this is perfectly fine for signals where the size of the outer structure is already defined, i.e. has no outer dimension:
signal xgbe_packet_lolli : t_avst_packet
(
data(G_XGBE_TXD_W - 1 downto 0),
empty(G_XGBE_TXE_W - 1 downto 0),
error(G_XGBE_TXX_W - 1 downto 0)
);
Now note that here we just have the cases of: No dimension (VSG handles properly) and 1 dimension (VSG doesn't yet handle it fully), but in general one might have any number of dimensions of that structure. I think now the question is how we would want to handle that. The general syntax for the dimensions would be additional pairs of parenthesis ("(...)"), one for each dimension, before coming to the inner structure of the record (for which VSG is now properly prepared).
I could image 3 possible configurations:
Maybe that should call in for yet another configuration parameter, what do you think?
Note that this comment actually also applies to constants (see config of constant_016
), so I think if we add it here, we add it there as well (and by design would also apply to variables).
In general I think we're a very good step ahead, but thanks to my code base (or due to it, you never know :wink:), we still find room for improvement.
@jeremiah-c-leary I will take a look as soon as time allows for it! Thanks for the update and notification. This answer is based on not fully reading @staerz s answer. š
Hey @staerz,
I could image 3 possible configurations:
- Have the indication of (all) the dimension(s) in a single line, i.e. the line of the "signal" declaration, or
- following the current rule of 1 dimension per line, or
- (all) the dimension(s) in a new line, i.e. the line after the "signal" declaration, but before the indication of the internal structure.
Maybe that should call in for yet another configuration parameter, what do you think?
What if we had the following two configuration options: first_dimension_new_line
and dimension_new_line
. The following examples are ignoring alignment as that is handled by another rule. However, I believe the indenting would just follow the existing rules.
If first_dimension_new_line
= no
and dimension_new_line
= no
and first_paren_new_line
= no
the following would be enforced:
signal felix_packets : t_avst_packets(first_dimension)(second_dimension)(inner_structure);
If first_dimension_new_line
= yes
and dimension_new_line
= no
and first_paren_new_line
= no
the following would be enforced:
signal felix_packets : t_avst_packets
(first_dimension)(second_dimension)(inner_structure);
If first_dimension_new_line
= no
and dimension_new_line
= yes
and first_paren_new_line
= no
the following would be enforced:
signal felix_packets : t_avst_packets(first_dimension)
(second_dimension)(inner_structure);
If first_dimension_new_line
= no
and dimension_new_line
= no
and first_paren_new_line
= yes
the following would be enforced:
signal felix_packets : t_avst_packets(first_dimension)(second_dimension)
(inner_structure);
If first_dimension_new_line
= yes
and dimension_new_line
= no
and first_paren_new_line
= yes
the following would be enforced:
signal felix_packets : t_avst_packets
(first_dimension)(second_dimension)
(inner_structure);
If first_dimension_new_line
= yes
and dimension_new_line
= yes
and first_paren_new_line
= yes
the following would be enforced:
signal felix_packets : t_avst_packets
(first_dimension)
(second_dimension)
(inner_structure);
I believe we can address all three of your scenarios with those two configuration options.
Note that this comment actually also applies to constants (see config of constant_016), so I think if we add it here, we add it there as well (and by design would also apply to variables).
I agree, once we get this working it should apply to constant and variable declarations also.
In general I think we're a very good step ahead, but thanks to my code base (or due to it, you never know š), we still find room for improvement.
Yes, your code base has given me the opportunity to improve my code base. :grin:
--Jeremy
Hi @jeremiah-c-leary,
alright, that looks good to me, but please allow me to lay down my understanding of your proposal:
first_paren_new_line
affects the first parenthesis of the inner structure, i.e. the first opening parenthesis belonging to the last closing one and indicates if that (sort of the) inner structure should or should not be brought to the new linefirst_dimension_new_line
indicates if the first dimension should or should not be brought to the new linedimension_new_line
indicates if any following dimension(s) should or should not be brought to the new lineThat sounds like a good start although I would propose calling the last option other_dimensions_new_line
.
Further, I'm wondering how these parameters would affect the parenthesis of the dimensions:
"first_paren_new_line":"yes",
"last_paren_new_line":"yes",
"open_paren_new_line":"yes",
"close_paren_new_line":"yes",
I understand that by first_dimension_new_line
and dimension_new_line
(or other_dimensions_new_line
) we would force the next opening parenthesis (which belongs to the next element) to the new line, but would we also want a configuration to allow this
signal felix_packets : t_avst_packets
(first_dimension)
(second_dimension)
(inner_structure);
and that:
signal felix_packets : t_avst_packets
(
first_dimension
)
(
second_dimension
)
(inner_structure);
i.e. to separate the actual indication of the dimension from the parenthesis as it is possible with the first, last, open, close-paren_new_line
settings.
Not that I'd be in favour of that or I'd ever use it, but just to be complete.
And another thing, just for consistency, we should make sure that things are self-consistent, i.e. in the following code, note that dimensions actually appear on 2 different levels:
signal felix_packets : t_avst_packets
(
minimum(3, G_TILES) - 1 downto 0
)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
(as reported VSG would want it currently) should (under the premises that we'd want the parenthesis of the dimensions stand alone on single lines) actually be formatted as
signal felix_packets : t_avst_packets
(
minimum(3, G_TILES) - 1 downto 0
)
(
data
(
G_FELIX_TXD_W - 1 downto 0
),
empty
(
G_FELIX_TXE_W - 1 downto 0
),
error
(
G_FELIX_TXX_W - 1 downto 0
)
);
as there's a dimension indication within the internal structure!
So I think VSG should be strong enough to apply the rules from the outside to the inside.
Maybe in that respect / from that angle we should review the first, last, open, close-paren_new_line
settings - I'll have to study the doc again, but I'd say what counts for opening counts for closing, so it seems a bit redundant here, but I may be off a bit now (by all this dimension thinking...).
So looks like a hard nut to crack, but I'm convinced we'll get there.
Hey @staerz,
alright, that looks good to me, but please allow me to lay down my understanding of your proposal:
- first_paren_new_line affects the first parenthesis of the inner structure, i.e. the first opening parenthesis belonging to the last closing one and indicates if that (sort of the) inner structure should or should not be brought to the new line
- first_dimension_new_line indicates if the first dimension should or should not be brought to the new line
- dimension_new_line indicates if any following dimension(s) should or should not be brought to the new line
Your understanding is matching my proposal.
That sounds like a good start although I would propose calling the last option other_dimensions_new_line.
I originally had successive_dimensions_new_line
, but I like your suggestion.
Further, I'm wondering how these parameters would affect the parenthesis of the dimensions:
I was thinking those parameters would only affect the inner structure. If we wanted to be consistent with the inner structure then I would propose the following parameters:
"first_dimension_new_line":"no",
"other_dimension_new_line":"no",
"dimension_open_paren_new_line":"no",
"dimension_close_paren_new_line":"no"
And another thing, just for consistency, we should make sure that things are self-consistent, i.e. in the following code, note that dimensions actually appear on 2 different levels:
Are data
, empty
and error
dimensions also?
Just to clarify something. In this example...
signal felix_packets : t_avst_packets(minimum(3, G_TILES) - 1 downto 0)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
is this the underlying type for t_avst_packets
?
type t_avst_packet is record
data : std_logic_vector;
valid : std_logic;
sop : std_logic;
eop : std_logic;
empty : std_logic_vector;
error : std_logic_vector;
end record;
I am having trouble understanding how the dimensioning works with t_avst_packets
being a record.
--Jeremy
Hi @jeremiah-c-leary,
I was thinking those parameters would only affect the inner structure.
OK, that's good for me, I just wanted to clarify if I fully understood your proposal.
Why would we need dimension_open_paren_new_line
and dimension_close_paren_new_line
? Would these set if after the opening parenthesis and before the closing parenthesis there's a newline?
I would say that this is too much, but since that setting exists for the inner structure I guess you're right, it should exist as well. One argument against it could be that it should possibly taking the setting from the inner structure, but on the other hand some people (maybe even me) would like to configure dimensions and structure differently (why ever) ... I think I'd have to have a look how it'd look like. (The question is: Would one setting for all be enough and fully consistent or would that look bad?)
Yeah, to complete the example, we have:
type t_avst_packet is record
data : std_logic_vector;
valid : std_logic;
sop : std_logic;
eop : std_logic;
empty : std_logic_vector;
error : std_logic_vector;
end record;
type t_avst_packets is array (natural range <>) of t_avst_packet;
as type definitions (and similar ones) and then as usable code
signal felix_packets : t_avst_packets(minimum(3, G_TILES) - 1 downto 0)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
such that:
minimum(3, G_TILES) - 1 downto 0
is the dimension of the record arrayG_FELIX_TXD_W - 1 downto 0
is the dimension of data
G_FELIX_TXE_W - 1 downto 0
is the dimension of empty
G_FELIX_TXX_W - 1 downto 0
is the dimension of error
Note that this is only possible since VHDL-2008. In previous versions the type t_avst_packet
would require to have the dimensions for all its elements defined at type declaration.
So if you consider the range of a std_logic_vector
a dimension, then yes, that is a dimension. But note that this is only an example and the record could contain any further substructure (a record could contain another record) and with VHDL-2008, dimensions when ever occurring for any of these structured elements would have to be given upon signal declaration (so they could appear at any level) [if they are not yet defined in the type directly].
I hope that gives you the full picture. Now all we have to do is to put this together :rofl:
Hey @staerz,
I am thinking it might be helpful to rename the elements to something like this:
signal felix_packets : t_avst_packets(first_outer_dimension)
(
(first_inner_dimension),
(second_inner_dimension),
(third_inner_dimension)
);
Where the *_outer_dimension
refers to range portion of t_avst_packets
...
type t_avst_packets is array (natural range <>) of t_avst_packet;
...and *_inner_dimension
refers to the elements of the t_avst_packet
record itself.
type t_avst_packet is record
data : std_logic_vector;
valid : std_logic;
sop : std_logic;
eop : std_logic;
empty : std_logic_vector;
error : std_logic_vector;
end record;
So all of the existing parameters:
"first_paren_new_line": "no",
"last_paren_new_line": "no",
"open_paren_new_line": "no",
"close_paren_new_line": "no",
"new_line_after_comma": "no",
could be renamed as:
"inner_dimension_first_paren_new_line": "no",
"inner_dimension_last_paren_new_line": "no",
"inner_dimension_open_paren_new_line": "no",
"inner_dimension_close_paren_new_line": "no",
"inner_dimension_new_line_after_comma": "no",
...and the parameters for the outer dimensions would be:
"first_outer_dimension_new_line":"no",
"other_outer_dimension_new_line":"no",
"outer_dimension_open_paren_new_line":"no",
"outer_dimension_close_paren_new_line":"no"
That would allow you to adjust the formatting based on the type of dimension.
Why would we need dimension_open_paren_new_line and dimension_close_paren_new_line? Would these set if after the opening parenthesis and before the closing parenthesis there's a newline?
I am not sure why someone would want to as it looks excessively noisy to me. However, if we wanted to be consistent then we would implement those rules.
The outer_dimension_open_paren_new_line
would insert a carriage return after the open parenthesis of a dimension:
From this:
signal felix_packets : t_avst_packets(first_outer_dimension)
(
(first_inner_dimension),
(second_inner_dimension),
(third_inner_dimension)
);
...to this...
signal felix_packets : t_avst_packets(
first_outer_dimension)
(
(first_inner_dimension),
(second_inner_dimension),
(third_inner_dimension)
);
The outer_dimension_open_paren_new_line
would insert a carriage return before the close parenthesis of an outer dimension:
From this:
signal felix_packets : t_avst_packets(first_outer_dimension)
(
(first_inner_dimension),
(second_inner_dimension),
(third_inner_dimension)
);
...to this...
signal felix_packets : t_avst_packets(first_outer_dimension
)
(
(first_inner_dimension),
(second_inner_dimension),
(third_inner_dimension)
);
But note that this is only an example and the record could contain any further substructure (a record could contain another record) and with VHDL-2008, dimensions when ever occurring for any of these structured elements would have to be given upon signal declaration (so they could appear at any level) [if they are not yet defined in the type directly].
I had not thought of nested records with respect to this rule. Would you happen to have any examples?
--Jeremy
Hi @jeremiah-c-leary,
let's take a step back: Given that structures can be nested in VHDL (like other languages) to any depth in any arbitrary way that is most suitable for the needs of the targeted development, let's not fall into terms of "inner" and "outer" as that gives the impression that only 2 levels would exist.
Instead, what we clearly see in terms of structure is record types that define a container with certain elements of any type and types (which include records) that can have a dimension, i.e. arrays (of records), std_logic_vectors, unsigned, and such.
If I'm not overlooking anything, these are the only 2 categories. So let's find a solution for that such that it fits what ever structural combination that you might find all those elements come together. I mean the VHDL reference is clear about that, so I think should VSG be able to follow that.
If you agree to that above then I think we should put back things on the table (i.e. existing configurations for structures) and see how we can come up with some consistent scheme (and consistent naming for all these configuration parameters) that will cover all these possibilities.
And no, out of my code base I don't have an example of a nested record, I'd have to go steel an example from a collaborator's project, but I guess you can easily imagine one.
Hey @staerz,
I apologize, I was distracted with fixing some errors.
Instead, what we clearly see in terms of structure is record types that define a container with certain elements of any type and types (which include records) that can have a dimension, i.e. arrays (of records), std_logic_vectors, unsigned, and such.
I believe this example matches:
type t_a is record
a : std_logic_vector;
b : std_logic;
end record;
type t_a_array is array (natural range <>) of t_a;
type t_b is record
x : std_logic;
y : std_logic_vector;
z : t_a_array;
end record
type t_b_array is array (natural range <>) of t_b;
such that when we create the signal and constrain the std_logic_vectors we could have something like this:
signal my_sig : t_b_array(3 dowto 0)
(
y(7 downto 0),
z(2 downto 0)
(
a(31 downto 0)
)
);
Where my_sig is an array of records and each record in the array includes an array of records.
--Jeremy
Hi @jeremiah-c-leary,
no need to apologise, we do things as we can.
Yes, that is a very simple example which summarises our problem to solve. And yes, I would in general set the parenthesis like you.
One aspect we shouldn't forget about is nested arrays for which also two (or more) range indications could follow, and also true multi dimensional arrays, please see this post.
Hey @staerz,
I pushed an intermediate commit to the issue-539 branch. I created a new option array_first_paren_new_line
which will insert/remove returns between the array identifier and the open parenthesis in the declaration. It looks like it is working for nested arrays also.
The behavior of first_paren_new_line
is changed slightly. It now applies to the first open parenthesis after every array entry.
My test examples use rule_016_test_input.nested_arrays.vhd
as the source file. My tests that fix the file based on parameters are using the files of the variation rule_016_test_input.nested_arrays.fixed_array_first_paren_line_line_[yes|no]_first_paren_new_line_[yes|no].vhd
and rule_016_test_input.nested_arrays.fixed_array_first_paren_new_line_[true|false].vhd
.
I still need to work through the variations with the existing options, however I would appreciate some feedback on the source against the fixed files. Some of the question I have are: 1) Am I heading in the right direction 2) What changes to the source file would you make to improve the robustness of the solution. 3) Are there any missing options
Thanks,
--Jeremy
Hi @jeremiah-c-leary,
thanks for coming back on this.
I had a look and would first of all propose to change the name of the configuration parameters.
Given the structure that we have identified, I find it highly confusing to refer to sth. "first_paren" when this first parenthesis might be the one of a range indication or the one of a record construct. I would say that both would/could have different implications of the intended formatting.
One thing during my tests that I realised: Using Sublime as editor, it actually switches syntax highlighting dependent on where the parenthesis is:
signal felix_packets : t_avst_packets(minimum(3, G_TILES) - 1 downto 0)
(
data(G_FELIX_TXD_W - 1 downto 0),
empty(G_FELIX_TXE_W - 1 downto 0),
error(G_FELIX_TXX_W - 1 downto 0)
);
makes the t_avst_packets
highlighted while as soon as I separate the following parenthesis to a new line, it becomes regular text.
Now that might just be a flaw of the VHDL highlighting plugin that I use, but it might also give us some good indication of what other people's practice is :wink:
So maybe we could bring in the terms "range", "structure" or "elements" into the configuration names. Here "range" would correspond to what we called "dimension" earlier.
It could just be shorter and maybe it would be more self-explanatory since an unconstrained type definition (which then obliges the indication upon signal declaration in the first place) would have natural range <>
in these parenthesis.
While in the type definition we have the keyword "record" to enclose the elements, maybe then the term "structure" or "elements" would be a more obvious correlation. One could, of course, go as far as to name the configuration sth. "recordopen..." but I'm not sure if we don't shoot ourselves in the legs with that term: Would there be other structures out in VHDL that are not records but still require a pair of parenthesis and would fall within the scope here? (If the answer is a very solid "no, impossible", then I wouldn't be opposed to just call it "record_something")
Now looking into how you organise the test files, I would suggest to shorten the naming as well: Instead of writing out the full name of the config each time, I think it would be more advisable to just have the actual setting in the name (so "yes" or "no"), but not the name of the setting itself. This does though require you to be rigorous on the order but I think that's easily achieved. So one file could just be called "rule_016_test_input.nested_arrays.fixed_array_yes_yes[_yes...].vhd (with as many "_yes" as configuration parameters there are. Might be 4 or 5 if I don't overlook anything). That would also allow you to really cover everything as your tests would basically be a power of 2 and easily loopable (over all parameters yes/no setting).
I think once we both are clear on the terminology (which is related to the overall structure and goal here), I think we can then go ahead. The next thing could be to just re-discuss which are settings that are needed. It might be that currently there is more than needed, but let's first agree on the structure/terminology and I'm convinced that this would help us to see that (just renaming can sometimes be very eye-opening).
And then we discuss about the implementation :wink:
Hey @staerz,
I mapped the signal_declaration out based on the LRM to see what is involved:
According to the LRM, we are applying constraints to the type. So I think it would make sense to include constraint
in the option paramenters. Interestingly any range constraint is actually an array_constraint
.
For example in the following code:
everything in red boxes would be an array_constriant
.
Everything in blue boxes are the open and close parenthesis in the record_constraint
.
Everything in green boxes are commas in the record_constraint
.
Everything in yellow boxes are record_element_simple_name
s:
So we could have a set of parameters like this:
rule:
signal_016:
array_constraint_single_line: 'yes'
array_constraint_open_paren_new_line: 'no'
record_constraint_open_paren_new_line: 'yes'
record_constraint_close_paren_new_line: 'yes'
record_constraint_comma_new_line: 'no'
record_constraint_element_new_line: 'yes'
I believe those options would enforce the following format in the code above:
What do you think?
--Jeremy
Hi @jeremiah-c-leary,
thanks for digging into the LRM, that is really useful!
Note that this example considers a bare record as the outermost structure, but one might also have an array of records.
That wouldn't change much w. r. t. the definition of array_constraint
and record_constraint
, but just to mention in order not to forget it.
So if I stick to your chosen example, we talk about
signal sig8 : record_type_3(1 downto 0)(3 downto 0)
(
...
);
So there is potential array_constraints
there.
With that regard, I fully subscibe to your analysis of identified elements and possible points of configuration.
Now let's go 1 by 1:
1) array_constraint_single_line: If set to 'yes', would you ensure that all pairs of parentesis that indicate all (potential) dimensions of the array are in 1 single line? I.e.
signal sig8 : record_type_3(1 downto 0)(3 downto 0)
Given that there's a second configuration for array constraints, I wonder what the 4 possible combinations (for this I'd say generally representing example) would look like.
I understand that we talk about the formating of the "red" part only at that moment.
2) array_constraint_open_paren_new_line: I could imagine you have in mind that an option 'yes' could lead to formating like this I.e.
signal sig8 : record_type_3
(1 downto 0)
(3 downto 0)
but I wonder in how far that wouldn't contradict array_constraint_single_line: 'yes' as now it's not all in 1 line. Having the 4 combinations laid out would help.
(at that point I'm tempted to say that one configuration option should be enough (which makes either my example 1) or the example 2) the rule. I can barely imagine that anybody would want to strip off the parenthesis from the range and spend extra lines on that. But let me know what you have in mind, I might miss something.)
3) record_constraint_open_paren_new_line: 'yes' would put the opening parenthesis into the new line, while 'no' would put it into the previous line together with the element name? In case of a range indication of the element, it would (in case of 'no') be behind the last closing parenthesis of the range indication, correct? [so this option sets or doesn't set the line break in front of the opening parenthesis?]
Fully OK with that!
4) record_constraint_close_paren_new_line: 'yes', similar to 3), except we control the line break in front of the closing parenthesis.
That's alright.
5) record_constraint_comma_new_line: 'no'. Not sure to fully understand this. I would think it decides if after a comma a line break is inserted, but your example has 'no' but line breaks.
6) record_constraint_element_new_line: 'yes' would insert a new line after the opening parenthesis of a record structure. (elements that follow other elements in a record are already covered by config 5) as they would be separated by comma)
The following picture tries to summarise my understanding of the record constraint settings:
where I assume all record constrain parameters on, i.e. inserting a new line:
I had hoped we could reduce the number of configurations for the record constraints, but by exercising this example I convinced myself of the fact that this is the exact correct amount.
Now for completeness I'd like to better understand your two options on array constraints.
Hey @staerz,
I like how you showed all the options on a single diagram. Let me update it with all the options:
option | symbol | options | actions |
---|---|---|---|
record_constraint_open_paren_new_line | green diamond | yes, no, ignore | insert, remove, allow carriage return before open paren |
record_constraint_close_paren_new_line | red penta-star | yes, no, ignore | insert, remove, allow carriage return before close paren |
record_constraint_comma_new_line | purple hexa-star | yes, no, ignore | insert, remove, allow carriage return before comma |
record_constraint_element_new_line | orange triange | yes, no, ignore | insert, remove, allow carriage return before element |
array_constraint_single_line | red box | yes, ignore | combine array_constraint into single line, allow array_constraint to span lines |
array_constraint_new_line | blue hexgon | yes, no, ignore | insert, remove, allow carriage return before open paren |
record_constraint_comma_new_line: 'no'. Not sure to fully understand this. I would think it decides if after a comma a line break is inserted, but your example has 'no' but line breaks.
All of the *_new_line
options will insert, remove or allow carriage returns before their respective item. So for the record_constraint_comma_new_line
option, the comma can be moved to a new line or moved to the previous line.
record_constraint_element_new_line: 'yes' would insert a new line after the opening parenthesis of a record structure. (elements that follow other elements in a record are already covered by config 5) as they would be separated by comma)
As with the _comma_
option, the record_constraint_element_new_line
would handle carriage returns before elements
The array_constraint_single_line
option would enforce array_constraints are on a single line. For example, if array_constraint_single_line
is yes
then
element2(4
downto
0
)(
7
downto
0
)
would be fixed to:
element2(4 downto 0)(7 downto 0)
If both array_constraint_single_line
and array_constraint_new_line
(maybe it should be array_contraint_open_paren
?) are set to yes
then
element2(4
downto
0
)(
7
downto
0
)
would be fixed to:
element2
(4 downto 0)
(7 downto 0)
If array_constraint_single_line
is ignore
and array_constraint_new_line
is yes
then
element2(4
downto
0
)(
7
downto
0
)
would be fixed to:
element2
(4
downto
0
)
(
7
downto
0
)
I had hoped we could reduce the number of configurations for the record constraints, but by exercising this example I convinced myself of the fact that this is the exact correct amount.
I agree. I do not believe you should have less than those 6 rules. Annotating the code snippet with those symbols really makes it obvious. We will have to keep that in mind for future issues with multiple options.
If we agree on the option names and their functions, I will first update/create a configuration page and have you review it before I start implementing. I want to say it will not be that hard, but I know better.
Regards,
--Jeremy
Hi @jeremiah-c-leary,
almost all good for the record constraint configuration options.
Only for record_constraint_comma_new_line
I have the question if you really foresee to make this control the new line to be inserted before the comma. I remind you that the current configuration is called new_line_after_comma
, so after the comma.
I cannot imagine any consistent linting where the comma would be put to a new line as a stand-alone character. Just like punctuation in semantics, I would always make that the last character of the sentence (with no space before it).
Now as I type that reply I think maybe there could be a check that there really is no space (in the spots of the purple hexa-star of your picture), but that's it.
Also I noticed that in your picture you consistently place the markers before the items to which the rule corresponds. I had tried to always put them where the actual line break character is (within the precision of a very quick and dirty graphics attempt). Both seem valid.
In my picture though, given that I interpret the rule on the comma as after, I think it makes the application of the following element rule redundant. I write that as I try to avoid to have redundant rules which could possibly result in self-contradicting configuration options.
(So "the record_constraint_element_new_line
would handle carriage returns before elements" except after comma (as record_constraint_comma_new_line
already treats that))
I would encourage you to consider that comma configuration...
For the array constraints: Do you honestly see any merit in spreading the range indication over multiple lines? I would always argue that (from your example), the "4 downto 0" and "7 downto 0" would always be together. What is a matter of style is where and how the parenthesis go.
Here are the options that I could imagine useful:
element(
4 downto 0
)(
7 downto 0
)
element(
4 downto 0
)
(
7 downto 0
)
element(4 downto 0)(7 downto 0)
element
(4 downto 0)
(7 downto 0)
The latter actually introduces an inconsistency on the parenthesis, but who knows, somebody might still like that style.
So I would actually prefer just one configuration which itself than has possible values that each refer to one of the styles above, plus one "ignore".
We though have to be careful: the actual start and end of the range might come from functions that take arguments and hence introduce another set of parenthesis. I would say depending on which style (of the above) one uses, it might become quickly inconsistent with setting parenthesis for functions. In order to prevent that, I think VSG could give the recommendation to not use the functions in the signal declaration on ranges, but introduce local constants instead and then use the constant in the range indication.
With those 4 "meaningful" options, I would just call the configuration array_constraint_format
and give some indicative names for the possible values. These could be:
separated_parenthesis
extra_separated_parenthesis
all_in_one_line
one_line_per_dimension
ignore
I think that would simplify things a lot. And if anybody comes around with yet another idea of typesetting, then we could just add another supported format and that's it.
"We will have to keep that in mind for future issues with multiple options.": It might actually be a good idea to bring a furbished version of that into the documentation of that configuration options then. Just like it helps us to identify and structure the problem, so will it help user to get a catch of what they actually configure here.
Hey @staerz,
Only for record_constraint_comma_new_line I have the question if you really foresee to make this control the new line to be inserted before the comma. I remind you that the current configuration is called new_line_after_comma, so after the comma.
I cannot imagine any consistent linting where the comma would be put to a new line as a stand-alone character. Just like punctuation in semantics, I would always make that the last character of the sentence (with no space before it).
I cannot imagine either, but I wanted to be able to fix a situation like this:
element1(7 downto 0)
, element2(7 downto 0)
which I would assume was not intentional. I could fix it silently, but if I have to put the logic in to move the comma then I might as well add the ability to configure it. It would also advertise to users the issue will be detected and fixed.
I remind you that the current configuration is called new_line_after_comma, so after the comma.
I think after working through the LRM, having the example code, and not referencing the options I had previously I ended up treating all the new lines as before instead of after. I also dropped the first_
and last_
options.
In order to prevent that, I think VSG could give the recommendation to not use the functions in the signal declaration on ranges, but introduce local constants instead and then use the constant in the range indication.
I had been thinking of adding a rule to check for ranges that are not range types to help the code express itself better.
element1(7 downto 0),
element2(15 downto 0)
versus
element1(byte),
element2(word)
You could also abstract away range calculations in a constant as you pointed out.
With those 4 "meaningful" options, I would just call the configuration
array_constraint_format
and give some indicative names for the possible values.
I like your proposal. I would want to keep the number of options small to start out and let users put in an issue if they want a different formatting option.
I do wonder about the one_line_per_dimension
option. I was thinking the array_constraint_format
would work against a single array constraint. However, if you have multiple array constraints in a row it would require a separate option. Probably not ideal. I will have to think about it.
"We will have to keep that in mind for future issues with multiple options.": It might actually be a good idea to bring a furbished version of that into the documentation of that configuration options then. Just like it helps us to identify and structure the problem, so will it help user to get a catch of what they actually configure here.
I agree. It would also be useful to update other configuration documentation.
Regards,
--Jeremy
Hi @jeremiah-c-leary,
On record_constraint_comma_new_line
:
if I have to put the logic in to move the comma then I might as well add the ability to configure it.
OK, so we both identify the need to in some way define where to put that comma: As I already pointed out, I'd actually like a rule that puts it directly after the range indicating closing parenthesis, on the same line. That is very similar to your attempt of trying to catch it if was moved to the next line.
Could we hence re-name and re-define that rule to define the location of that comma simply? It would then have more than just 2 options:
And if we renamed it to record_constraint_comma
?
This should have that story covered then.
And I agree that this would then be complementary to the record_constraint_element_new_line
configuration.
Note that maybe dropping the _new_line
from the other configuration names and moving it to the value would be a good idea, too. That would make in particularly the value no
turn into sth. like same_line
, yes
into new_line
and be hence more descriptive and self-explanatory to a user in general, what do you think?
I like your proposal. I would want to keep the number of options small to start out and let users put in an issue if they want a different formatting option.
:100: what I'm saying.
I do wonder about the one_line_per_dimension option. I was thinking the array_constraint_format would work against a single array constraint.
I'd say no. It's not working against it, it should be tolerant to any number of dimensions. The 4 options I laid out would work also with just 1 dimension, or 3 or more (or any other number):
For 1 dimension:
"separated_parenthesis"
element(
4 downto 0
)
"extra-separated_parenthesis"
element(
4 downto 0
)
doesn't change anything as that would just have an obvious effect for sequences of ")(" which we don't see here for the lack of extra dimensions
"all_in_one_line"
element(4 downto 0)
"one_line_per_dimension"
element
(4 downto 0)
For 3 dimensions:
"separated_parenthesis"
element(
4 downto 0
)(
7 downto 0
)(
9 downto 0
)
"extra-separated_parenthesis"
element(
4 downto 0
)
(
7 downto 0
)
(
9 downto 0
)
doesn't change anything as that would just kick in for sequences of ")("
"all_in_one_line"
element(4 downto 0)(7 downto 0)(9 downto 0)
"one_line_per_dimension"
element
(4 downto 0)
(7 downto 0)
(9 downto 0)
So it's really just the structure that we'd define here, and the dimensions just fill in ...
This would be to throw 1 configuration at it.
If we wanted to do similar to how we do for record constraints, then we'd have to introduce 4 configurations: each before and after of opening and closing parenthesis. So that would require 16 tests if you want to walk yourself through that pain :wink: I'd argue that the above 4 is already quite reasonable.
Hey @staerz,
Could we hence re-name and re-define that rule to define the location of that comma simply? It would then have more than just 2 options:
- ignore: doesn't care for it and doesn't touch anything
- space: leaves one space between it and the closing parenthesis, so requiring ") ,"
- no_space: leaves no space between it, also no line break, so directly requiring "),"
- new_line: for the crazy people who actually want to put it on a new line (indentation would then still need to be defined but > that should just be at the location where the first character of the element name starts I suggest)
And if we renamed it to record_constraint_comma?
I like this approach. I would implement the ignore
and no_space
options. The space
option is handle by another rule which removes spaces before a comma. The last one I would implement if someone asks for it. No use encouraging that style.
Note that maybe dropping the _new_line from the other configuration names and moving it to the value would be a good idea, too. That would make in particularly the value no turn into sth. like same_line, yes into new_line and be hence more descriptive and self-explanatory to a user in general, what do you think?
I this idea also. It gives a lot more flexibility in defining behaviors. So the options would be:
option | symbol | options | actions |
---|---|---|---|
record_constraint_open_paren | green diamond | new_line, previous_line, ignore | insert, remove, allow carriage return before open paren |
record_constraint_close_paren | red penta-star | new_line, previous_line, ignore | insert, remove, allow carriage return before close paren |
record_constraint_comma | purple hexa-star | previous_line, ignore | remove, allow carriage return before comma |
record_constraint_element | orange triange | new_line, previous_line, ignore | insert, remove, allow carriage return before element |
array_constraint | red box | all_in_one_line, one_line_per_dimension, ignore | combine array_constraint into single line, put each dimension on it's own line, allow array_constraint to span lines |
Not so sure on the previous_line
wording. new_line
makes sense though. Any suggestions on a better phrase?
So that would require 16 tests if you want to walk yourself through that pain š I'd argue that the above 4 is already quite reasonable.
I can see the advantages with testing.
--Jeremy
Hi @jeremiah-c-leary,
that looks like we're making good progress here, I like how things start coming together now!
Very agreed for record_constraint_comma
!
No use encouraging that style.
:100:
For the wording of "previous_line", would the option name "attached" be better? Like "attached to the previous element [in the previous line]"
And for the action, I think "allow carriage return before open paren" should simply say "don't care". I.e. that would also include that if there'd be a space (or multiple spaces), VSG would simply ignore it (or other rules as you say might kick in for a few details here...). Let's say that this is at least what I'd expect to happen as a user if I have an "ignore" option.
For the array_constraint
though I see you dropped the options "separated_parenthesis" and "extra-separated_parenthesis". Not sure if I would like to use one of these actually, possibly not as "one_line_per_dimension" looks good to me. So would that be sth. that you'd say "could be implemented upon user demand"? I'd be totally fine with that.
Also here for the wording of the action, I think "allow array_constraint to span lines" should just read "don't care" with the implication that VSG really doesn't care - and simply leave it to the record constraint setting to format from the character after the closing parenthesis onward. (I'm happy we made this picture, it's super useful! :+1: )
Hey @staerz,
I'm happy we made this picture, it's super useful!
Completely agree. Makes we wonder why it took so long to come up with. I'm sure it would have made some of our earlier conversations easier. It provides so much clarity.
For the array_constraint though I see you dropped the options "separated_parenthesis" and "extra-separated_parenthesis". Not sure if I would like to use one of these actually, possibly not as "one_line_per_dimension" looks good to me. So would that be sth. that you'd say "could be implemented upon user demand"? I'd be totally fine with that.
Yeah, I figured I would keep the variations to a minimum for now. Nothing worse than writing and testing code that nobody uses.
As for the previous_line
question, what if I just make the options more descriptive.
option | symbol | options |
---|---|---|
record_constraint_open_paren | green diamond | add_carriage_return_before, remove_carriage_return_before, ignore_carriage_return_before |
record_constraint_close_paren | red penta-star | add_carriage_return_before, remove_carraige_return_before, ignore_carriage_return_before |
record_constraint_comma | purple hexa-star | remove_carriage_return_before, ignore_carriage_return_before |
record_constraint_element | orange triange | add_carriage_return_before, remove_carriage_return _before, ignore_carriage_return_before |
array_constraint | grey box | all_in_one_line, one_line_per_dimension, ignore_formatting |
--Jeremy
Hi @jeremiah-c-leary,
I would not overdo it, i.e. not have the options be sentence-like values. That just increases the chance for adding a typo and having users wonder why it wouldn't work as they'd expect.
We won't get around adding good documentation, i.e. a description for each option. To avoid repeating what the option name said, I'd rather keep it short, so I liked what we had before, just with "previous_line" being called "attached" instead.
You keep the notion of just focusing on the new line though in your ignore option which I think is a mistake. I would say, just like one can disable rules, one should be able to disable certain configuration options. Disabling rules is done by setting "disable" to "true" and disabling configuration options should just be done by "ignore". (Same value name for all configurations no matter which rule!)
So that would also apply to the array_constraint
option: Make it ignore
and not ignore_formatting
(everything VSG is looking out for is formatting, so that's in it by definition).
Hey @staerz,
You keep the notion of just focusing on the new line though in your ignore option which I think is a mistake. I would say, just like one can disable rules, one should be able to disable certain configuration options. Disabling rules is done by setting "disable" to "true" and disabling configuration options should just be done by "ignore". (Same value name for all configurations no matter which rule!)
I agree. Seeing an undocumented rule I have been "following" stated clearly really helps. Thanks.
I would not overdo it, i.e. not have the options be sentence-like values. That just increases the chance for adding a typo and having users wonder why it wouldn't work as they'd expect.
We won't get around adding good documentation, i.e. a description for each option. To avoid repeating what the option name said, I'd rather keep it short, so I liked what we had before, just with "previous_line" being called "attached" instead.
That makes sense. I cannot fully convey what the option does in the option name itself. I will need documentation with examples to really describe what the options and their possible values accomplish.
option | symbol | values | actions |
---|---|---|---|
record_constraint_open_paren | green diamond | add_new_line, remove_new_line, ignore | insert, remove, allow carriage return before open paren |
record_constraint_close_paren | red penta-star | add_new_line, remove_new_line, ignore | insert, remove, allow carriage return before close paren |
record_constraint_comma | purple hexa-star | remove_new_line, ignore | remove, allow carriage return before comma |
record_constraint_element | orange triange | add_new_line, remove_new_line, ignore | insert, remove, allow carriage return before element |
array_constraint | grey box | all_in_one_line, one_line_per_dimension, ignore | combine array_constraint into single line, put each dimension on it's own line, allow array_constraint to span lines |
Why do I feel the need to have new_line
somewhere in the values? I suppose the value could be something more complicated than just adding or removing carriage returns. For example the all_in_one_line
vs one_line_per_dimension
.
What do you think of the above table?
--Jeremy
Hi @jeremiah-c-leary,
I'm generally happy with the table, in particular I'm happy with columns option, symbol and values.
For actions I would though recommend to separate the explanations for the "add_new_line/remove_new_line" setting vs. the "ignore": I think my previous comment on it wasn't too clear:
I hope that our understanding is the following:
It is mainly on the last point: Also spaces are ignored.
Given that all 4 record constraints use the same 3 values (comma is the exception), it would possibly be not such a bad idea to split the option column into one column that identifies the "defining structural element" (which would be: "opening parenthesis", "closing parenthesis", "comma", "new element") and the actual "description".
It could look like this:
option | symbol | values | structural element | description |
---|---|---|---|---|
record_constraint_open_paren | green diamond | add_new_line, remove_new_line, ignore | opening parenthesis |
|
record_constraint_close_paren | red penta-star | add_new_line, remove_new_line, ignore | closing parenthesis | |
record_constraint_comma | purple hexa-star | remove_new_line, ignore | comma | |
record_constraint_element | orange triangle | add_new_line, remove_new_line, ignore | new element | |
array_constraint | grey box | all_in_one_line, one_line_per_dimension, ignore | array range indication | combine array_constraint into single line, put each dimension on it's own line, allow array_constraint to span lines |
(and here's the HTML for that as markdown doesn't support the row span:
<table>
<thead>
<tr>
<th>option</th>
<th>symbol</th>
<th>values</th>
<th>structural element</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>record_constraint_open_paren</td>
<td>green diamond</td>
<td>add_new_line, remove_new_line, ignore</td>
<td>opening parenthesis</td>
<td rowspan=4>
<ul>
<li>The setting "add_new_line" enforces a carriage return (alias "new line") [and, consequently by indentation rules kicking in, also (indirectly) enforces the indentation of the new line]</li>
<li>The setting "remove_new_line" enforces the removal of any potential space and carriage return</li>
<li>The setting "ignore" disables the option and hence no formatting check is done at all: spaces and new lines can be anything</li>
</ul>
</td>
</tr>
<tr>
<td>record_constraint_close_paren</td>
<td>red penta-star</td>
<td>add_new_line, remove_new_line, ignore</td>
<td>closing parenthesis</td>
</tr>
<tr>
<td>record_constraint_comma</td>
<td>purple hexa-star</td>
<td>remove_new_line, ignore</td>
<td>comma</td>
</tr>
<tr>
<td>record_constraint_element</td>
<td>orange triangle</td>
<td>add_new_line, remove_new_line, ignore</td>
<td>new element</td>
</tr>
<tr>
<td>array_constraint</td>
<td>grey box</td>
<td>all_in_one_line, one_line_per_dimension, ignore</td>
<td>array range indication</td>
<td>combine array_constraint into single line, put each dimension on it's own line, allow array_constraint to span lines</td>
</tr>
</tbody>
</table>
)
So that would be what I'd put.
Now from the row symbol I get that the documentation would also contain the picture. Here I'd only recommend to remove the white space between the array element and the range indication. It looks a bit odd with so much space. (A little space just to nicely put the grey box in without squeezing is fine.)
Hey @staerz,
I hope that our understanding is the following:
- The setting "add_new_line" enforces a carriage return (alias "new line") [and, consequently by indentation rules kicking in, also (indirectly) enforces the indentation of the new line]
- The setting "remove_new_line" enforces the removal of any potential space and carriage return
- The setting "ignore" disables the option and hence no formatting check is done at all: spaces and new lines can be anything
It is mainly on the last point: Also spaces are ignored.
That matches with my understanding.
Given that all 4 record constraints use the same 3 values (comma is the exception), it would possibly be not such a bad idea to split the option column into one column that identifies the "defining structural element" (which would be: "opening parenthesis", "closing parenthesis", "comma", "new element") and the actual "description".
We can add the structural element
to the table. I do like the way it looks with the spanned rows for the description. It works because the options for each row do the same thing, just on different structural elements. I will need to see how to span rows in restructured text.
Now from the row symbol I get that the documentation would also contain the picture. Here I'd only recommend to remove the white space between the array element and the range indication. It looks a bit odd with so much space. (A little space just to nicely put the grey box in without squeezing is fine.)
I agree. I will recreate the picture without the spaces and move the grey boxes accordingly.
Well, it seems like we have an agreement on the options and what they do. Here is my plan going forward:
1) Document the options for the rule in VSGs sphinx documentation 2) Review the documentation with you to get concurrence and update as necessary 3) Begin implementing the options.
Regards,
--Jeremy
Hi @jeremiah-c-leary,
in case the table with spanning rows poses a problem in structured text we might also not use a table and rather use an approach with sections with a reference to a common description.
But yes, fully agreed, good documentation is sth. that is largely underestimated, so let's see that me make a good job here as well.
Alright, so let me know when ever you have sth. that you'd want my feedback on!
I'm really happy to see us get here so far and I hope and think that addressing this issue would be a major improvement to VSG!
Hey @staerz,
I toke a first attempt at adding documentation for this at: https://github.com/jeremiah-c-leary/vhdl-style-guide/blob/issue-539/docs/configuring_multiline_constraint_rules.rst
I think it needs some work, but the basic structure is there.
Let me know what you think and if there is anything I should change.
Thanks,
--Jeremy
Hi @jeremiah-c-leary,
looks good, I found a few minor details:
I'd check the other examples once the above has made it's way into the doc if that's fine with you (taking a step at a time...)
Hey @staerz,
I believe I addressed your comments above.
Thanks,
--Jeremy
HI @jeremiah-c-leary,
thanks. I think the iteration using letters in the 2nd dimension is not bad at all :smile:
Yes, now I agree to the indentation of the 1st example.
Looking into the 2nd example I noticed that even the figure that we use has an inconsistency in the indentation!
sig8
itself is on char 1 (so with the first char of signal
)element2
is indented and not aligned with the first char of element2
Now I think that would be subject of indentation rules, but I'd say that only 1 level of indentation per level of structure should be done, hence pushing the opening parenthesis back to the left.
The third example doesn't look right yet to me, shouldn't elementA
and elementB
now be on the same line?
So far for now!
Hey @staerz,
Looking into the 2nd example I noticed that even the figure that we use has an inconsistency in the indentation!
Would you agree the indentation of this figure:
The third example doesn't look right yet to me, shouldn't elementA and elementB now be on the same line?
Yeah, there were a lot of issues with the format. I have updated it.
Let me know what you think of the updates.
Thanks,
--Jeremy
Hi @jeremiah-c-leary,
yes, I agree to the figure (I see you left intentional space before the comma in order to inject the symbol).
I also agree to the formatting of the 3rd example "record_constraint_element set to 'remove_new_line'".
Note that still the indentation of the 2nd example needs to shift back the 2nd level indentation (for the opening of the parenthesis and element-letter).
And one remark for all: Maybe it would look better if you put the keyword (option name and value names) into monospace font (in headlines and in descriptive text) of the examples ... and in the table as well for consistency.
Hey @staerz,
I updated the figure and the indenting in the 2nd example.
I also monospaced the option name and values. I am not so sure on the results though.
Let me know what you think.
Thanks,
--Jeremy
Hi @jeremiah-c-leary,
thanks!
Yeah, I can understand your hesitation towards having the option names and values in monospace but from my experience it's a matter of getting the habit. It's nothing that I'd count indispensable, so if you really don't like it, revert it. [note that for consistency, the entire documentation should then possibly be reworked in that aspect and I'm not sure if that's worth doing ... esp. if you don't like it! ... so consider it an idea, but not all my ideas are unbeatable]
The 2nd example (record_constraint_close_paren
) looks good now.
Re-looking at the 3rd example (record_constraint_element
): You seem to have changed the indent from 2 spaces to 4 spaces. I would avoid that and also use 2 spaces here for the sake of leaving the confusion of the user to a minimum.
But parenthesis and line breaks look good to me.
Coming to the 4th example (array_constraint
set to one_line_per_dimension
):
I'd say here we have to agree on what the indent should be: I always argue for 1 level per level, but in addition we have the array constraint that also needs some visible indent. I would suggest to simply de-indent (back to the level of the element, the 2nd level opening parenthesis, so from this
signal sig8 : record_type_3
(
element1
(7 downto 0),
element2
(4 downto 0)
(7 downto 0)
(
elementA
to this
signal sig8 : record_type_3
(
element1
(7 downto 0),
element2
(4 downto 0)
(7 downto 0)
(
elementA
(the difference is in the last 2 lines only)
So signal
is at "0-indent", element-number
is at "1-indent", element-letter
is at "2-indent".
Clearly the opening parenthesis for the array constraint would be 1 indent more than the opening parenthesis for the record but that assures that the array constraint is clearly visible and the difference to the structural parenthesis is clear as well.
Let me know what you think.
Hey @staerz,
Yeah, I can understand your hesitation towards having the option names and values in monospace but from my experience it's a matter of getting the habit. It's nothing that I'd count indispensable, so if you really don't like it, revert it. [note that for consistency, the entire documentation should then possibly be reworked in that aspect and I'm not sure if that's worth doing ... esp. if you don't like it! ... so consider it an idea, but not all my ideas are unbeatable]
I think it is starting to grow on me. It seemed odd to use monospace in the titles of the examples. Looking at it again, I can see how using monospace makes those items pop out in the text.
Re-looking at the 3rd example (record_constraint_element): You seem to have changed the indent from 2 spaces to 4 spaces. I would avoid that and also use 2 spaces here for the sake of leaving the confusion of the user to a minimum. But parenthesis and line breaks look good to me.
Yes, I see that now. I adjusted it to 2 spaces.
Clearly the opening parenthesis for the array constraint would be 1 indent more than the opening parenthesis for the record but that assures that the array constraint is clearly visible and the difference to the structural parenthesis is clear as well.
I see your point on indenting the dimensions but not the record constraint. I'm sure we can make it configurable for those who would like a different indent.
I pushed an update with the indent changes to the third and fourth example.
I think we could be done with the documentation updates.
Let me know what you think.
--Jeremy
Hi @jeremiah-c-leary,
I've looked over it again and confirm that now the 4th example looks good to me as well.
Yeah, as you mention, indentation of the parenthesis is another thing but I would actually not foresee a configuration for that, it would only make it inconsistent. The only useful case I'd see is if the array constraint on a new line should be indented one level more than the element it belongs to or not (and hence be aligned with the opening parenthesis of the content to follow.
Other than that we're done documenting, with the occasional inconsistency which now still exists regarding the monospace of the values in the table in the values and description column. Fix that or don't, depending how much monospace for these appeals to you (and yes, making it more stick out from the rest is the clear benefit. Now if it has light grey background or not is really up to the style sheet and not much of a concern, at least not to me).
Hey @staerz,
Just wanted to give you an update.
It looks like I am able to parse and classify our example code correctly which is the precursor to creating the options we defined. I still need to convince myself that it is working though. The recursion through my classifier hurts my head sometimes.
I also need to see if I can break my code with more complex examples.
--Jeremy
Hi @jeremiah-c-leary,
thanks for the heads-up! Yeah, I can imagine your headache. Unfortunately I don't have any more examples in my code that'd get more complicated than what I posted in the issue description. If it's of any help I could though come up with sth. crazy random for your testing.
Hey @staerz,
The issue I am currently facing is knowing when I have run into an array_constraint
or a record_constraint
.
I used this pattern to detect a record_element
: [open_paren, token, open_paren]. Which works well for simple range
definitions like:
(element1(7 downto 0), element2(7 downto 0))
However, I started chasing down the array_constraint
and I eventually run into the range
definition, which breaks out into this:
If you follow the path from range
you go through term
, then factor
and come to primary
. Here is the definition of primary
:
primary ::=
name
| literal
| aggregate
| function_call
| qualified_expression
| type_conversion
| allocator
| ( expression )
name
would not be an issue as the pattern would be [open_paren, name, 'downto']
literal
would not be an issue as the pattern would be [open_paren, literal, 'downto']
aggregate
would not be an issue as the pattern would be [open_paren, open_paren]
qualified_expression
would not be an issue as the pattern would be [open_paren, token, tick]
allocator
would not be an issue as the pattern would be [open_paren, new
]
( expression )
would not be an issue as the pattern would be [open_paren, open_paren]
However, both functional_call
and type_conversion
would be a problem.
Their patterns would be [open_paren, token, open_paren]. This happens to match my record_constraint
pattern check.
One way to resolve this would be to keep track of function and type names as I process a file. Unfortunately, these can be placed into a package and VSG only processes a single file at a time.
At the moment I am stuck, but I will be thinking of some solution to this. If you have any ideas, please let me know.
Regards,
--Jeremy
I suppose one option is to have a rule that is very limited to how it is applied:
For example I just ran my classifier on the following code:
signal sig9 : record_type_3(func1(A, 5, 6) downto func2(4, 6, 8));
and it was classified as:
signal <class 'vsg.token.signal_declaration.signal_keyword'>
sig9 <class 'vsg.token.signal_declaration.identifier'>
: <class 'vsg.token.signal_declaration.colon'>
record_type_3 <class 'vsg.token.subtype_indication.type_mark'>
( <class 'vsg.token.record_constraint.open_parenthesis'>
func1 <class 'vsg.token.record_element_constraint.record_element_simple_name'>
( <class 'vsg.token.index_constraint.open_parenthesis'>
A <class 'vsg.parser.todo'>
, <class 'vsg.token.index_constraint.comma'>
5 <class 'vsg.parser.todo'>
, <class 'vsg.token.index_constraint.comma'>
6 <class 'vsg.parser.todo'>
) <class 'vsg.token.index_constraint.close_parenthesis'>
downto <class 'vsg.token.record_element_constraint.record_element_simple_name'>
func2 <class 'vsg.token.record_element_constraint.record_element_simple_name'>
( <class 'vsg.token.index_constraint.open_parenthesis'>
4 <class 'vsg.parser.todo'>
, <class 'vsg.token.index_constraint.comma'>
6 <class 'vsg.parser.todo'>
, <class 'vsg.token.index_constraint.comma'>
8 <class 'vsg.parser.todo'>
) <class 'vsg.token.index_constraint.close_parenthesis'>
) <class 'vsg.token.record_constraint.close_parenthesis'>
; <class 'vsg.token.signal_declaration.semicolon'>
Everything is fine until the first open parenthesis. It is classified as a record_constraint
, then everything else afterwards is misclassified until the semicolon.
Hmmm....I wonder if I could use the downto
or to
in the range
definition as part of my check.
range ::=
range_attribute_name
| simple_expression direction simple_expression
So if I find a direction
at the same "level" as the open parenthesis then I know I am in an array_constraint
.
This could work. I will think about it some more.
--Jeremy
Well, things were going well. I was able to detect the direction
and then classify accordingly. However, I ran down the range_attribute_name
:
and one of the options for prefix
is a functional_call
. So now I am back to needing to know if a token is a function name.
I will need to think on this one...
--Jeremy
Hi @jeremiah-c-leary,
thanks for poking thoughts around here. I see it's not trivial at all!
Just looking from a possibly more external view, I'd say the structure defining a record constraint which one would always find is the comma surrounded by parenthesis (but making sure not to confuse it with any other internal structure). It could then be multiple commas actually.
For the detection of the direction I'd say that this might be even more difficult, note that one might also use the 'range
attribute. In that case one wouldn't see a to
or downto
to identify a range. (I think the combination of finding one of the 3 would be covering it though, I cannot think of a 4th way identifying a range.)
Hey @staerz,
Just looking from a possibly more external view, I'd say the structure defining a record constraint which one would always find is the comma surrounded by parenthesis (but making sure not to confuse it with any other internal structure). It could then be multiple commas actually.
Are you suggesting using a comma
to determine a record_constraint
. Like this:
element1(elementA(7 downto 0), elementB(7 downto 0))
For the detection of the direction I'd say that this might be even more difficult, note that one might also use the 'range attribute. In that case one wouldn't see a to or downto to identify a range. (I think the combination of finding one of the 3 would be covering it though, I cannot think of a 4th way identifying a range.)
I could detect the 'range
and then know I have an array_constraint
.
I find the fact you can have a function return a range to be intriguing. Under what circumstances would you do such a thing?
The core issue is I can't tell the difference between these:
record_element ::= ( record_element_simple_name ( ...
array_constraint ::= ( function_name ( ...
array_constraint ::= ( type_mark ( ...
VSG works on a single file at a time without any knowledge of other files. This has some benefits:
1) The code just needs to be syntactically correct 2) Code does not need to compile 3) Can run in an editor easier
...and some drawbacks:
1) Can not fully parse and classify code (current example) 2) Consistent case rules can not always be applied
It seems the only way for VSG to overcome this limitation is to feed it enough files for it to pull all the function names and type names. There would have to be a configuration somewhere that pulls in all the files. That would give us the ability to handle more rules. However, it would also increase the time it takes to run VSG on a single file. Could you image if you passed it all the files in your project (lets say 100) just to have it run on a single file? It might just become unusable at that point.
I'm thinking maybe I could save some key information, like packages and the types in them, in an object and then pickle it for later use. VSG would then pull in the stored object with all the references which it could then apply to the single file. VSG would also need to know if the stored information is out of date so it could re-parse it to ensure the information is up to date.
Another option would be to allow for the current parsing method, which just ignores the issue, and also implement a "real" parsing method if the function_name
and type_mark
exist. I suppose I would also need to keep track of record_element_simple_name
s in order to ensure I have a record element.
Side thought...I suppose you can't have a record_element_simple_name
, function_name
, and type_mark
be the same value.
I was thinking of making taking VSG down this path, but now I am not so sure.
Maybe there are just inherent limitations with VSG, and maybe that is okay.
I would like it if it solved 100% of the formatting problems, but is 95% good enough? Is 90% good enough?
Do I take VSG down the path where the code has to be compilable?
It is nice to be working in my editor and just completely ignoring style. Then I hit a macro key and poof, the code is formatted, regardless of whether it will compile.
Maybe VSG could operate differently on files depending on the amount of information it has available? Maybe some rules are only executed when there is enough information?
I really do not know.
Any opinions?
Regards,
--Jeremy
Hi @jeremiah-c-leary,
Are you suggesting using a comma to determine a record_constraint. Like this:
Yes. To my understanding it is the comma that sanely identifies a record as a record (without knowing its definition). There is one exception though: If the record only had one defined element.
But then the rule to put a new line after the comma or now (which is what we are after here primarily) doesn't apply at all and I honestly think that this is not going to appear in any code (I don't see why one would define a record with just one element).
I find the fact you can have a function return a range to be intriguing.
I think you misunderstood my point. VHDL doesn't have a type for a range
, a range can hence never be retrieved from a function or in any other likewise way.
What VHDL allows though is the 'range
attribute which any ranged element would have and that is sth. that I just wanted to point you to on your mission to identify a range: It could be either by the to
, downto
or 'range
keyword. Note that the 'range
keyword would though be following another signal's or constant's name.
Let me give you an example that I'd have in mind:
constant take_my_range: std_logic_vector(4 downto 2);
signal avst_packet_o : t_avst_packet(
data(take_my_range'range),
empty(AVST_EMPTY_W - 1 downto 0),
error(0 downto 0)
);
The above is perfectly valid VHDL code (and I think I'm using it here and there (at least the concept of picking the range from a constant or another signal), so VSG should be supporting that :wink:
The core issue is I can't tell the difference between these:
I'd say forget about identifying this purely based on positional relations of parenthesis. Identifying related pairs of parenthesis and excluding it from the list of candidates so to say would rather be useful, but that's it.
As said before, the record element is identified by the comma while the array constraint is identified by a true range (defined via to
, downto
or 'range
.
It seems the only way for VSG to overcome this limitation is to feed it enough files for it to pull all the function names and type names.
Clear no, forget about that. This will not work unless the entire code base is passed to VSG and that is rarely the case. In particular that would not work in my use case where only fractions of the overall code are passed to VSG independent of each other. VSG must be able to parse a file on its own without knowledge of anything else.
Identifying structure based on parsing just one VHDL file is the way to go.
(If you went any further than that you'd basically do the first job of a synthesis tool that tries to make sense of the code and put things together (instantiations, function definitions and function calls, etc.). If you ask me, no, that's not what we should aim for.)
I suppose you can't have a record_element_simple_name, function_name, and type_mark be the same value.
I'm not sure but at least one can overload functions which would make its signature different (more or less commas, different types of parameters) so that's a dead end.
Do I take VSG down the path where the code has to be compilable?
Strong NO. Again, that would be the job of a synthesiser and VSG may not have access to all required code for that. (It will not in my case)
Then I hit a macro key and poof, the code is formatted, regardless of whether it will compile.
That's a strong yes. If it compiles or not doesn't matter, you (usually) need more than one file in order to answer that question and it's the synthesiser which does that.
Any opinions?
So I think that's a lot of opinion and I hope that it was useful in any way.
Hey @staerz,
Yes. To my understanding it is the comma that sanely identifies a record as a record (without knowing its definition). There is one exception though: If the record only had one defined element.
Unfortunately, I do not believe we can use the comma as both the array_constraint
and record_constraint
can have a comma.
As said before, the record element is identified by the comma while the array constraint is identified by a true range (defined via
to
,downto
or'range
.
I think you are close. We can not use the comma, but I do believe we can use downto
, to
and '
:
So essentially, when I check for an array_constraint
or an element_constraint
, I search for downto
, to
and a '
at the same "level" as the open paren. I will then follow these rules:
1) If I find any of them before I reach the matching closing parenthesis, then I have found an array_constraint
.
2) If I do not find any of them before I reach the matching closing parenthesis, then I have found an element_constraint
.
So I think that's a lot of opinion and I hope that it was useful in any way.
I did not really think through the ramifications of forcing code to be compilable until now. I believe it reduces the usability for only a small gain in enforcing consistent case rules across multiple files.
Thanks for your feedback. It provided the clarity I needed.
I am going to add checking for the '
for the array_constraint
and check some more test cases to see if there is anything else lurking around.
Regards,
--Jeremy
Is your feature request related to a problem? Please describe. The rule signal_016 requires a signal declaration to be on 1 single line. In case of simple signals (
std_logic
or alike) this is very reasonable, but for signals that are structures (records), this is not always meaningful and finally conflicts withlength_001
.The problem is hence self-consistency.
Describe the solution you'd like The solution would be to extend
signal_016
to support multi-line signal declaration, following paranthesis-parsing (like for functions, signal assignments or alike.Describe alternatives you've considered There is no alternative other than disabling the rule.
Additional context Here is an example of a signal declaration of a more complex type:
with the underlying type declaration (which poses no problems):
Note that actual data length is intentionally not defined in the type declaration (VHDL-2008) to give flexibility.