soctrace-inria / framesoc

Framesoc is a generic trace management and analysis infrastructure.

Unexpected display of Pajé trace #191

Open frs69wq opened 8 years ago

frs69wq commented 8 years ago

Hi

I just tried using Framesoc to visualize Pajé traces produced by SimGrid. I used one of our examples (examples/msg/trace-process-migration) that moves a process from host to host. The trace is procmig.txt

Here is the Gantt chart displayed by Framesoc, in which the "emigrant" process seems to always be located on the "Bourassa" host: [image: gantt-framesoc]

The expected output for this trace (as displayed by ViTE) should be: [image: gantt-ViTE]

Did I miss something in the configuration of Framesoc, or is this a bug in the display of Pajé traces?

Best F.

ycorre commented 8 years ago

Hi,

It is a known problem that SimGrid generates Pajé traces that are not fully compliant with the guidelines of the latest version of the Pajé trace format (although I think modifications were made in SimGrid, I can't remember whether all the problems were fixed). We have a wiki page listing the guidelines for importing Pajé traces into Framesoc.

In the trace you provided, the problem is that several processes (or containers in Pajé terminology) have the same name ("emigrant-1"), which results in all those processes being more or less merged into a single one.

Regards, Youenn

mquinson commented 8 years ago

That's strange since this trace file is correctly displayed in Vite for example...

@schnorr do you know more about how traces produced by SimGrid were modified at some point to stick to the FrameSoc guidelines, maybe?

ycorre commented 8 years ago

> That's strange since this trace file is correctly displayed in Vite for example...

That's because Framesoc does not import the Pajé trace directly, but a .csv version of it created with the pj_dump utility. In this CSV version, the containers are identified by their names, so if there are several containers with the same name, they will be merged. I have attached the pj_dump version of the trace provided in the first message: procmig_pjdump.txt
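The merging described here can be illustrated with a minimal sketch. It is not Framesoc's importer code: the function `index_by_name` and its "first declaration wins" policy are assumptions made for illustration only. The rows are the Container lines from the pj_dump output quoted later in this thread.

```python
import csv
import io

# Container rows copied from the pj_dump excerpt quoted later in this thread.
PJ_DUMP = """\
Container, 0, HOST, 0, 18.1551, 18.1551, Bourassa
Container, Bourassa, MSG_PROCESS, 10.0872, 12.1126, 2.02544, emigrant-1
Container, 0, HOST, 0, 18.1551, 18.1551, Ginette
Container, Ginette, MSG_PROCESS, 16.1385, 18.1551, 2.01655, emigrant-1
"""

def index_by_name(text):
    """Hypothetical importer step: key containers by name only.

    Column 1 is the parent container and column 6 is the container name.
    The first declaration of a name wins (an assumption, not verified
    against the actual importer).
    """
    parents = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row and row[0] == "Container":
            parent, name = row[1], row[6]
            parents.setdefault(name, parent)
    return parents

parents = index_by_name(PJ_DUMP)
print(parents["emigrant-1"])  # the second "emigrant-1" (on Ginette) is lost
```

With a name-only key, the two "emigrant-1" containers collapse into a single entry, which matches the symptom reported in the first message.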

Also I am not sure about the modifications in SimGrid: I might be confusing it with another tool, but since I no longer have access to the mailbox I was using while I was working on the project, I can't check the name of the tool.


frs69wq commented 8 years ago

If you look at lines 48 to 54 of https://github.com/soctrace-inria/framesoc/files/268960/procmig_pjdump.txt, you can see that all the information is there to show that the "emigrant-1" States are first associated with the "Bourassa" container and then with the "Ginette" container. IMHO, it looks like Framesoc associates all the States with the first container it encounters and then never changes it. This would also explain why the states are displayed on Bourassa, which is neither the first nor the last host visited by this moving process. However, it is the first one in the trace.

schnorr commented 8 years ago

@mquinson The process migration trace seems to do exactly what we implemented a couple of years ago. It uses container aliases to uniquely identify each container, with a different alias for each container that shares the same name, as allowed by the Paje specification. When this correct trace file is simulated with pj_dump, the output disregards all alias information. We could dump the aliases to CSV, since the information is there, but aliases were designed simply to save space.

Perhaps Framesoc should read Paje trace files instead of CSV files, since the CSV format is not set in stone the way the Paje specification is: https://github.com/schnorr/pajeng/blob/master/doc/lang-paje/lang-paje.pdf

ViTE works because it reads Paje trace files and takes aliases into account. As far as I know, SimGrid generates Paje traces that are fully compliant with the Paje specification. If that's not the case, show me the offending Paje trace file created with SimGrid that can't be pj_dump'ed.

dosimont commented 8 years ago

@schnorr We preferred to reuse pj_dump instead of implementing a Pajé importer ourselves for several reasons:

For the moment, nobody is actively working on Framesoc, since @youenn, @generoso and I are no longer involved in the SoC-Trace project, but we can set aside some time to fix bugs or correct important issues. Modifying pj_dump's output may be faster (we will have to change the pj_dump importer accordingly anyway) than writing a new importer from scratch.

schnorr commented 8 years ago

Hi @dosimont and perhaps @ycorre and others. When you say you are using pj_dump, you are in fact using its output (the CSV file). That's not a good choice, because you are dealing with something that is no longer in the Paje format but something else (you in fact lose some information, as we have seen in this example given by @frs69wq).

Considering this, you might be interested in the work of @taisbellini, who has just graduated in computer science here at UFRGS by implementing a Paje simulator in Java, building a parser for Paje trace files from scratch with the neat JavaCC tool. All the code and the performance evaluation we have conducted are on GitHub (you can also check her LabBook.org). The link:

https://github.com/taisbellini/aiyra

Aiyra stands for daughter in the Guarani language, and it is designed to be extended by implementing plugins in Java. I, and of course @taisbellini, can give you further details if you decide to proceed with it. It should be very simple, since you get access to the exact same information you already have through pj_dump, but following the rigor of the Paje format. She developed another pjdump-like tool using her simulator. It works perfectly as far as we know (you should not use its output, but instead use the plugin infrastructure). The only thing unsupported right now is extra fields, but that's pretty easy to add according to @taisbellini.

dosimont commented 8 years ago

I understand your point. I'm wondering, however, about the purpose of pj_dump's CSV format if it's not fully consistent with the original Pajé trace file. I thought it was a more convenient way to represent a Pajé trace (CSV is indeed easier to parse, and since states and links are already computed, we avoid redoing this operation each time we import the trace into Framesoc), but without any information loss.

I think the best temporary solution, for the moment, would be to differentiate the migrating processes by using two different names (adding a suffix to the name, for instance), until we start a new Pajé importer, either reusing @taisbellini's work or starting from scratch (depending on how difficult it is to adapt to Framesoc). Of course, if one of you has an unoccupied student, don't hesitate to assign this task to him/her ;-)

frs69wq commented 8 years ago

Well, if I look back at the trace that led to this issue, all the information is in the CSV produced by pjdump:

1: Container, 0, HOST, 0, 18.1551, 18.1551, Bourassa
2: Container, Bourassa, MSG_PROCESS, 10.0872, 12.1126, 2.02544, emigrant-1
3: State, emigrant-1, MSG_PROCESS_STATE, 10.087178, 12.087178, 2.000000, 0.000000, sleep
4: State, emigrant-1, MSG_PROCESS_STATE, 12.087178, 12.112617, 0.025439, 0.000000, receive
5: Container, 0, HOST, 0, 18.1551, 18.1551, Ginette
6: Container, Ginette, MSG_PROCESS, 16.1385, 18.1551, 2.01655, emigrant-1
7: State, emigrant-1, MSG_PROCESS_STATE, 16.138521, 18.138521, 2.000000, 0.000000, sleep
8: State, emigrant-1, MSG_PROCESS_STATE, 18.138521, 18.155073, 0.016552, 0.000000, receive

However, I wonder which of the lines that start with 'Container' are parsed/used. As said in a previous comment, it seems that only line 2 is used to definitively associate "emigrant-1" with a host.
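To make the point concrete, here is a minimal sketch of a reader that does use every 'Container' line: each State row is attached to the most recently declared container of that name. This relies on the row order seen in the excerpt above (a Container row precedes the States that belong to it), which is an assumption and not a documented guarantee of pj_dump. The function name `states_with_hosts` is hypothetical and is not part of Framesoc.

```python
import csv
import io

# The eight rows quoted above, without the "1:" ... "8:" labels.
PJ_DUMP = """\
Container, 0, HOST, 0, 18.1551, 18.1551, Bourassa
Container, Bourassa, MSG_PROCESS, 10.0872, 12.1126, 2.02544, emigrant-1
State, emigrant-1, MSG_PROCESS_STATE, 10.087178, 12.087178, 2.000000, 0.000000, sleep
State, emigrant-1, MSG_PROCESS_STATE, 12.087178, 12.112617, 0.025439, 0.000000, receive
Container, 0, HOST, 0, 18.1551, 18.1551, Ginette
Container, Ginette, MSG_PROCESS, 16.1385, 18.1551, 2.01655, emigrant-1
State, emigrant-1, MSG_PROCESS_STATE, 16.138521, 18.138521, 2.000000, 0.000000, sleep
State, emigrant-1, MSG_PROCESS_STATE, 18.138521, 18.155073, 0.016552, 0.000000, receive
"""

def states_with_hosts(text):
    """Attach each State row to the latest Container row with the same name."""
    current_parent = {}  # container name -> parent given by its latest Container row
    result = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue
        if row[0] == "Container":
            current_parent[row[6]] = row[1]
        elif row[0] == "State":
            name, start, value = row[1], float(row[3]), row[7]
            result.append((current_parent.get(name), start, value))
    return result

hosts = [host for host, _, _ in states_with_hosts(PJ_DUMP)]
print(hosts)  # ['Bourassa', 'Bourassa', 'Ginette', 'Ginette']
```

On this excerpt the States end up split between Bourassa and Ginette, as in the ViTE display. The approach breaks as soon as rows for two same-named containers are interleaved.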


alegrand commented 8 years ago

Sweet, I didn't know taisbellini was reimplementing a parser in java. Out of curiosity, what's the motivation? Better interaction with other tools ? Portability ? I only had a quick look at the labbook but this was not sufficient to understand the whys and wherefores of this work.

dosimont commented 8 years ago

@frs69wq In this trace, because the containers migrate and two containers with the same name are never alive at the same moment, we could indeed associate the events with their actual container by comparing the timestamps of the container creation and destruction, and the event occurrence. However, according to the Pajé trace format specifications,

> The use of aliases also enables for the definition of more than one container with the same name, by using one different alias for each container.

If I understand correctly, this makes it possible to have several containers with the same name (but with different aliases) alive at the same moment (the text places no restriction on this). In that case, comparing the timestamps would not be enough to identify the actual container of an event. So we need either the aliases in the CSV, to be compliant with this specification, or to parse the Pajé trace directly, as recommended by @schnorr.
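The timestamp comparison and its limit can be sketched as follows. This is an illustration only: the `Container` record, the `resolve` function and the tolerance `eps` are assumptions, not Framesoc code. The tolerance is needed because the pj_dump excerpt prints container lifetimes with fewer digits (10.0872) than state timestamps (10.087178). The "worker" containers are made-up data for the overlapping case.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Container:
    name: str
    parent: str
    start: float
    end: float

def resolve(containers, name, start, end, eps=1e-3):
    """Return the parent of every same-named container alive during [start, end].

    eps absorbs the rounding of container timestamps in the CSV (assumed value).
    """
    return [c.parent for c in containers
            if c.name == name and c.start - eps <= start and end <= c.end + eps]

# Lifetimes taken from the two "emigrant-1" Container rows quoted in this thread.
migrating = [
    Container("emigrant-1", "Bourassa", 10.0872, 12.1126),
    Container("emigrant-1", "Ginette", 16.1385, 18.1551),
]

# Hypothetical case: two same-named containers alive at the same time.
overlapping = [
    Container("worker", "HostA", 0.0, 10.0),
    Container("worker", "HostB", 5.0, 15.0),
]

print(resolve(migrating, "emigrant-1", 10.087178, 12.087178))  # one candidate
print(resolve(overlapping, "worker", 6.0, 7.0))                # two candidates
```

A single candidate means the timestamps suffice. More than one candidate is exactly the situation in which only an alias, or the original Pajé trace, can tell the containers apart.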

schnorr commented 8 years ago

@alegrand In the beginning, the motivation behind Aiyra was to create a very simple database model for the Paje format, so we could use SQL requests (in R) instead of CSV files to get complete entities. As a side effect, we'd also like some Unix-like tools to get data from the database (like an SQL pjdump). The objective shifted when we saw the opportunity to create a plugin-based framework so that others could rely on the JavaCC parsing @taisbellini has developed. As of today, one can simply extend the abstract class PajePlugin and implement the methods according to the plugin's objective (having direct access to simulated entities). We never thought about portability, but it is a nice side benefit now.

schnorr commented 8 years ago

@dosimont @frs69wq Indeed, you can have multiple containers with the same name but different aliases, possibly at the same time. The purpose of aliases in the Paje file format is to have a shorter reference to containers, reducing the file size. One very simple solution to this problem would be to add an option to pjdump so that it exports the aliases of the containers. Of course, the importer in Framesoc would have to be adapted to that, and users should be aware that only CSV files produced by pjdump with this option would be loaded correctly. I once asked @ycorre to develop a headless Framesoc interface to run the import of a file from the command line. Wouldn't it make sense to use that implementation as the base for a new one that uses Aiyra's infrastructure?

dosimont commented 8 years ago

@schnorr Do you plan to add a new field, or put the alias instead of the actual container name?

> I asked one time for @ycorre to develop a headless framesoc interface to execute the import of a file from the command-line. Wouldn't be the case to use that implementation as base for a new one that uses Aiyra's infra?

I heard about that, but I don't know if this feature is currently available. If it is not, I'm afraid that nobody will have time to finish it. Regarding the relationship between this feature and the importers, I guess that this command-line interface would just run the appropriate Framesoc importer with the right parameters, instead of going through the Eclipse GUI. So a new importer should just follow the rules and interfaces that Framesoc imposes on importers, and if the command-line feature has been designed correctly, it will be able to detect this new importer automatically and make it available.

I think there is redundancy between Aiyra and Framesoc, since both do basically the same thing: convert a trace into a database. In practice, you could use Framesoc databases without going through Framesoc, and run SQL requests on them directly.