Open amicic opened 3 years ago
On a more general note, I have a feeling that concurrent-collection-start/end is overly complicated.
concurrent-collection-end is simply redundant and brings no value.
concurrent-collection-start (which actually is a 3 liner - see below) could probably be collapsed to one line. we don't have anything similar to report in GMP (in Balanced) and Concurrent Scavenger final increments.
Additionally 'start' from concurrent-collection-start is somewhat confusing. Someone might think of it as a concurrent cycle start, but in fact it's just the start of the final STW increment of a concurrent cycle. We already have a couple of other indicators it's a final STW (exclusive-start and gc-start) reported nearby.
<concurrent-collection-start id="800" timestamp="2020-10-19T07:18:51.676" intervalms="55891.125" >
<concurrent-trace-info reason="card cleaning threshold reached" tracedByMutators="656526016" tracedByHelpers="45082100" cardsCleaned="32133" workStackOverflowCount="0" />
</concurrent-collection-start>
Please can you add an ID label, and state that the following topic may need an update to the log snippet(s):
vgclog.md
Many Thanks
I’m in favour of removing concurrent-collection-end
as it doesn’t give any useful info. Exclusive is released right after reporting this, the exclusive stanza is sufficient for knowing the order of events. I also think concurrent-collection-start
can be collapsed, the only concern here is the stanza can get fairly long with all the attributes on the same line, especially with “reason” string
I also think element name concurrent-collection-start
is misleading, it hints at "concurrent" activity (with mutators), whereas it's pure STW
Good point about the long name. we could leave it as a multi-liner.
As for the rename of concurrent-collection-start, I'm struggling a bit, but how about concurrent-final, or better concurrent-mark-final or concurrent-global-final (since we may have concurrent scavenge as well)?
I think the name should have global
in it, so I like conurrent-global-final
. It shows a relationship to cycle-end that follows it:
<concurrent-global-final id="800" contextid="797" timestamp="2020-10-19T07:18:51.676" intervalms="55891.125" >
....
<cycle-end id="810" type="global" contextid="797" timestamp="2020-10-19T07:18:51.766" />
We don't mention "mark" in any global concurrent stanzas (e.g concurrent-kickoff), so I'm not for concurrent-mark-final
. I think it's best to leave it out.
Hi both, adding in from a user experience perspective...(@RSalman FYI I'm currently writing the OpenJ9 topic on verbose GC logs)
Are we able to use tag names for concurrent STW events that mean the user can immediately know what STW event is being referred to, without having to trawl through documentation? e.g. It is not obvious to a user that, when looking at the logs, the information being reported are STW events that exist within a concurrent cycle / a certain amount of background knowledge is assumed.
From what I understand, the way a user would analyse the logs for STW events in a concurrent cycle, would be to search for the word 'concurrent', so they can differentiate STW events of a STW cycle easily from the STW events of a concurrent cycle.
For the rest of the tag name, the actual tag related to the concurrent STW event in the concurrent cycle changes from policy to policy. The question is, what does the user want to know when she searches for 'concurrent', and how does this depend on the policy type?
As I see it, there are 2 things we can get across in the tag name to improve UX and to also ensure consistency across different
ii) include a word that links the STW event to the concurrent collection type. Which part of the collection definition is most important? that it is global or local, or that it is a mark & sweep etc.?
For this
<concurrent-global-final-STW-collection>
Thoughts?
After researching for writing the vgc log topic, I would like to discuss the user experience of the logs, to determine if we could improve this experience by renaming the concurrent tags, which would also lead to less and clearer documentation. I would like to do a bit of back-of-the-envelope user experience testing my end to get this right. Perhaps I can arrange a call with you both in the next couple of weeks? @SueChaplain @pmhayward FYI
The other question I had which users could find confusing: why is it that the <exclusive-start>
tag sometimes precedes the <concurrent...>
tag, and other times follows the <concurrent...>
tag? Is there a reason for this?
Are we able to use tag names for concurrent STW events that mean the user can immediately know what STW event is being referred to, without having to trawl through documentation? e.g. It is not obvious to a user that, when looking at the logs, the information being reported are STW events that exist within a concurrent cycle / a certain amount of background knowledge is assumed.
This is true, a certain amount of background knowledge is assumed. Anything between
From what I understand, the way a user would analyse the logs for STW events in a concurrent cycle, would be to search for the word 'concurrent', so they can differentiate STW events of a STW cycle easily from the STW events of a concurrent cycle.
I would say that it's not a straight forward search for "concurrent", it depends on what the user is looking for. A concurrent tag can show up in exclusive/STW (e.g concurrent-collection-start)
Just some background info to clarify To better understand the logs, we should look at cycles in terms of phases:
A cycle may have STW phases or Concurrent phases. For instance, with concurrent standard collectors/policies (gencon, optavgpause) global cycles are broken down into the following phases:
A verbose log would roughly look something like this for this cycle:
Concurrent kickoff tag //Cycle is about to start, some initialization work is being done concurrently
Exclusive Start //STW 1
Cycle Start
Exclusive End
Exclusive Start //STW 2
Concurrent Collection Start //This tag name is being changed here to concurrent-global-final
Cycle End
Exclusive End
For the rest of the tag name, the actual tag related to the concurrent STW event in the concurrent cycle changes from policy to policy. The question is, what does the user want to know when she searches for 'concurrent', and how does this depend on the policy type?
- For gencon, user needs to know whether this is the first STW event, or the final collection. What else?
- For balanced, user needs to know whether this is a tag that simply records an intermediate snapshot of the heap, or whether the tag is recording a concurrent STW event.
- For other policies, what does the user need to know? What concurrent events are logged?
@amicic could answer this better. Perhaps a call would be better to go over this and some of the other points you've mentioned.
The other question I had which users could find confusing: why is it that the tag sometimes precedes the
tag, and other times follows the tag? Is there a reason for this?
Not sure if I understand the question, which tag are we referring to?
The other question I had which users could find confusing: why is it that the tag sometimes precedes the
tag, and other times follows the tag? Is there a reason for this? Not sure if I understand the question, which tag are we referring to?
My bad Salman, I was referring to the <exclusive-start>
tag sometimes preceding a
As I see it, there are 2 things we can get across in the tag name to improve UX and to also ensure consistency across different tags and different policies: i) Include 'STW' explicitly in the name, so that user knows immediately what part of the concurrent cycle they are looking at (also vs. any other tags that do NOT label a STW event). ii) include a word that links the STW event to the concurrent collection type. Which part of the collection definition is most important? that it is global or local, or that it is a mark & sweep etc.?
For this tag modification, given the above, is there a problem with a tag being this long:
Indeed, some inconstancy can be observed when reporting STW and concurrent phases of a cycle
As Salman already said, STW event can be found from exclusive-start/end. And yes, we must assume some general GC background knowledge and reader getting familiar with our log specifics. So, I don't see much need to add that STW in more tags name. concurrent-global-final is very good compromiseto me. It's just saying why we triggered this STW phase: to end the ongoing cocurrent global cycle (as opposed to hiting allocation failure), but it's already clear it's a STW.
Generally, we have very little room from changing existing tag names, since we could break tools that parse the logs (such as Eclipse GCMV). concurrent-collection-start/end are be some of very rare ones
Hi @amicic @RSalman has this defect been put into an iteration plan yet? - I am wondering when the following changes will be actioned, so I can determine whether to keep the current log contents in the openJ9 docs for the next release, or to use the changed log content? - will the following be actioned by 15th Jan?
Many thanks.
Sorry @MelHopper I missed your question, changes went in earlier this month to address this issue. @amicic could you please close this.
While concurrent-collection-start/end are attached to cycle-end that does have contextid, for completeness it's probably wise to add it to concurrent-collection-start, as well.