micromouseonline / mazefiles

A set of classic micromouse (マイクロマウス) maze files in text format

Proposed file format change #7

Closed micromouseonline closed 5 years ago

micromouseonline commented 6 years ago

After using the files for a while, I think the usefulness could be improved with a modification to the format:

Goal Area: For both the classic and half size mazes, the goal is not a single cell but a rectangular region. Neither contest specifies that there should be only a single entrance to that area and so it makes sense to mark ALL the goal area cells with the letter 'G'.

Meta Data - Size: Users of the text mazes must examine the entire data file before they know the dimensions of the maze. An assumption can be made that the maze is square and one of two predetermined sizes, but that is not a given. Some contests do not use a full 16x16 or full 32x32 maze. Practice mazes may be any size. Since any contest maze fits inside a 32x32 footprint, it is always possible to make this work. However, it would be more useful if the maze dimensions were recorded in the file. If they are at the start of the file, the parser has useful information to help it on its way and could also dynamically allocate only as much memory as is needed for the actual maze represented in the file.

Meta Data - comment: Sometimes a maze needs a descriptive comment. This may be just the name in a less cryptic format than the filename. Also, there is nothing in the filename that tells the user whether the maze is from a classic or half-size contest/event. Nor can that information be inferred from the maze size, since 16x16 half-size mazes may be used for qualifying rounds or smaller events. A free-text comment field at the start of the maze file will aid the user in identifying its use and purpose.

Proposed Change: A single comma-delimited line should be added as the first line in the file. This line would contain the meta data described above in the format

width,height,description

width and height are both integer values, normally in the range 2-32.

description is a non-quoted ASCII text string. This would normally be less than 120 characters in length so that the line is not longer than the printed width of a maze. A description should not include a comma, as this will be interpreted as a delimiter.

A possible maze file might then contain:

8,7,Small Home Test Maze
o---o---o---o---o---o---o---o---o
|                               |
o   o   o---o---o---o---o---o   o
|   |                           |
o   o---o---o---o---o---o---o   o
|   |         G   G   G |       |
o   o   o---o   o   o   o   o---o
|   |   |   | G   G   G |       |
o   o   o   o---o---o---o---o   o
|   |   |                   |   |
o   o   o---o   o---o---o   o   o
|   |                   |   |   |
o   o   o---o---o---o   o   o   o
| S |   |               |       |
o---o---o---o---o---o---o---o---o

Existing code could adapt to the change by simply examining the first character of the file to see if it is a digit. A maze row never needs to start with a digit, since even the zero character is not a good choice for a post. If no digit is found, parsing can continue as before. Otherwise, the line can be skipped and parsing can continue on the second line. Or, of course, the metadata could be used to guide a more extensively modified parser.
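A minimal sketch of that check in Python, assuming the file has already been read into a list of lines. The function name read_header and its return shape are hypothetical and not part of the repository.

```python
def read_header(lines):
    """Return (width, height, description, maze_lines).

    width, height and description are None when the first line does not
    start with a digit, i.e. when the file has no header line.
    """
    if lines and lines[0][:1].isdigit():
        # maxsplit=2 keeps a stray comma inside the description intact
        parts = lines[0].split(",", 2)
        width, height = int(parts[0]), int(parts[1])
        description = parts[2].strip() if len(parts) > 2 else ""
        return width, height, description, lines[1:]
    # Old-style file: no header, the maze starts on the first line
    return None, None, None, lines
```

A parser that does not care about the metadata can simply keep the last element of the returned tuple and carry on as before.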

If you have suggestions for, or objections to, this change please let me know as soon as possible.

@kangzhangqi do you use these files?

Peque commented 6 years ago

Agree with everything. I think in Japanese half-size mazes, even if the goal is in an open area, the goal consists of a single cell, in which case we should only mark that cell with a "G". Classic mazes, though, do have 4 goal cells and the goal area may have multiple entrances.

Peque commented 6 years ago

@idt12312 @kerikun11 @ucgosupl @tokoro10g @dangorogoro

Do you use this maze files repository? If so, do you have any comments/ideas?

Would you like to contribute if you find any missing mazes so that we all use the same repository with a canonical text-based format?

micromouseonline commented 6 years ago

The Japanese rules and advance notification of the goal for half size explicitly state that the goal is a region not a cell. They do not guarantee a single entrance.

Peque commented 6 years ago

@micromouseonline Do you have any references? Do they give you multiple goal coordinates in advance then? (one for each cell in the end point area)

I read the opposite in http://www.ntf.or.jp/

http://www.ntf.or.jp/mouse/micromouse2014/kitei_half_since2014.html

> The position of the end point is expressed by the coordinates of the entrance of the end point region. (Expression method is shown in Fig. 2)

Also, if you look here for the 32x32 half-size maze, the goal is a single cell (or at least they only mark one in all the end point region):

http://www.ntf.or.jp/mouse/micromouse2014/MM2014recode01.html

But maybe the rules changed since 2014 or those are not the rules to follow. I of course have no practical experience with half-size Japan competitions, so I trust you there. :joy:

micromouseonline commented 6 years ago

The advance notice for 2018 regionals is here: http://www.ntf.or.jp/mouse/micromouse2018/local_meeting.html

And the current rules are here: http://www.ntf.or.jp/mouse/micromouse2018/kitei_micro_since2018.html

Peque commented 6 years ago

@micromouseonline Ahhh I see. They now give you 2 goals, that define the area. In that case yeah, we should mark all the area with "G"s. But maybe for older mazes we should use a single "G", as it seems the rules changed in 2018?

What I mean is that we should not add to the maze file information that the mouse/participant did not know in advance (so before 2018 the mouse only knew a single goal cell, but did not know the size or position of the end-point area).

Do you agree?

micromouseonline commented 6 years ago

Yes, it would, in any case, be too tedious to go back and change them. However, there are not enough half size mazes for it to be a problem. I have some in preparation from Japan2011 on and I will check them before they get added.

Notice also, BTW, that the regional events use a 16x16 half size maze and that the contest names have now changed. Which isn't helping much.

tokoro10g commented 6 years ago

@Peque

> Do you use this maze files repository? If so, do you have any comments/ideas? Would you like to contribute if you find any missing mazes so that we all use the same repository with a canonical text-based format?

Yes, but not quite. I think the human-readable text data is suitable for the canonical format. Here is my thought:

It would be useful if the header format were based on a key-value format such as JSON or TOML so that the user can easily extend the properties for debugging purposes. In fact, @idt12312 has adopted a JSON format, but there the maze data is not human-readable. https://github.com/idt12312/Mazelib/blob/master/maze_data/classic_alljapan_2011_expert.json

My suggestion is a hybrid of the current and the JSON format, as shown below (using TOML format as a header).

title = "Small Home Test Maze"
author = "Peter"
class = "classic"
width = 8
height = 7
start = [0, 0]
goal = [ [3, 3], [3, 4], [4, 3], [4, 4], [5, 3], [5, 4] ]
memo = """
This format would be nicer.
You can add multi-line comments thanks to the TOML header.
"""

========================================================

o---o---o---o---o---o---o---o---o
|                               |
o   o   o---o---o---o---o---o   o
|   |                           |
o   o---o---o---o---o---o---o   o
|   |         G   G   G |       |
o   o   o---o   o   o   o   o---o
|   |   |   | G   G   G |       |
o   o   o   o---o---o---o---o   o
|   |   |                   |   |
o   o   o---o   o---o---o   o   o
|   |                   |   |   |
o   o   o---o---o---o   o   o   o
| S |   |               |       |
o---o---o---o---o---o---o---o---o

PROS:

CONS:

Any ideas or comments?

Peque commented 6 years ago

@tokoro10g Good points.

I would vote for a 1-liner JSON with the least number of fields included. If you want to split it into multiple lines (i.e.: TOML/YAML), then I would vote for the maze text representation to be included as a field. But not sure @micromouseonline would be happy with any of this. I think he wanted to make parsing easier for low-level languages or home-made libraries.

The changes proposed were about including the width and height, but not the start cell and goal(s). They are all redundant information. So I am not sure why we should include some but not the others. If we include all, the description line is longer and it is harder to parse. Here again, if you already have an advanced parser, you do not really need any of that information. Maybe the "name" or "description" field, to avoid having to write too long file names.

In my case, I am happy with the current format, as I infer dimensions, start cells and goals from the file. I could live with a 1-line header to make things easier for those not having the right tools. I could live with making the format TOML/YAML compliant (making the text maze representation a field). I could live with any other solution too, but I think they make less sense.
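To illustrate what inferring everything from the file can look like, here is a rough Python sketch. It is not the code from any existing tool. It assumes each cell is four characters wide and two lines tall, as in the examples in this thread, and that cell (0, 0) is at the bottom left. The function name infer_maze_info is hypothetical.

```python
def infer_maze_info(lines):
    """Infer (width, height, start, goals) from the rows of a text maze.

    Assumes 4-character-wide cells and (0, 0) at the bottom left.
    """
    rows = [line for line in lines if line.strip()]
    width = (len(rows[0].rstrip()) - 1) // 4
    height = (len(rows) - 1) // 2
    start, goals = None, []
    for r in range(height):
        cell_row = rows[2 * r + 1]  # odd rows hold the cell contents
        y = height - 1 - r          # the top text row is the highest y
        for x in range(width):
            idx = 4 * x + 2         # centre character of cell x
            mark = cell_row[idx] if idx < len(cell_row) else " "
            if mark == "S":
                start = (x, y)
            elif mark == "G":
                goals.append((x, y))
    return width, height, start, sorted(goals)
```

On the 8x7 example maze from this thread, this should report a start of (0, 0) and the same six goal cells that are listed in the TOML header example.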

@micromouseonline Any thoughts? The more I think about it, the more I conclude the best format is the current format. :joy:

tokoro10g commented 6 years ago

> I think he wanted to make parsing easier for low-level languages or home-made libraries.

I thought that was within the scope of micromouseonline/micromouse_maze_tool. It is easy to manipulate the proposed format to get a simpler text or even binary data.

> The changes proposed were about including the width and height, but not the start cell and goal(s). They are all redundant information.

I'm sorry that I led the discussion off in another direction. And yes, I know :) I tried to include as much information as possible to make it an extreme case. I don't really think they are needed. Anyway, my suggestion is probably off-topic, so I will open a new thread if anyone needs it.

micromouseonline commented 6 years ago

@tokoro10g Thank you for taking the time to look at this and contribute some thoughts. It would be useful to have some more fields available. Mostly, this all shows that everyone has different needs and there are many ways to get there.

Originally, it had not really occurred to me that anyone might care very much. I assumed that anyone looking for mazes would want it to be as simple as possible. Every time I come across new maze data, it seems to be in yet another format.

So, I think for now, I want to just include the single line with width, height and comment. Mostly this is because I would not want to require users to add parsers for TOML or JSON, even though that is generally a case of finding a few files and adding them to a project.

Many of the proposed additional fields are redundant or potentially error prone. For example, the goal area is actually defined by NTF as the corners of a rectangle. I think anyone editing the files with a tool or even by hand could add the indicators into the data itself. Any additional information could be incorporated into a single comment. It is not at all clear how often the author field would be either useful or correct, and the designation of classic vs half size is extra confusing since NTF changed the contest name. Generally I think users really only care about identifying a specific contest event maze, and that is best done through the filename/path.
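As a small illustration of why a list of goal cells is redundant when the two corners are known, the cells can be generated from the corners. This is a sketch only. The function name goal_cells and the (x, y) tuple convention are assumptions made for the example.

```python
def goal_cells(corner_a, corner_b):
    """List every (x, y) cell in the rectangle spanned by two corner cells."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    return [
        (x, y)
        for x in range(min(xa, xb), max(xa, xb) + 1)
        for y in range(min(ya, yb), max(ya, yb) + 1)
    ]
```

For example, goal_cells((3, 3), (5, 4)) gives the six cells of a 3x2 goal area, whichever order the two corners are passed in.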

Adding the maze dimensions is all about making it easier to parse the data and not have to make possibly erroneous inferences from the data itself.

Sorry - that is all a bit rambling really. The long and short of it is, I don't want to add any more complexity than I have to. Really, I would rather not add anything at all but this is the least I think is useful.

Peque commented 6 years ago

To summarize my thoughts on this as well:

> Really, I would rather not add anything at all but this is the least I think is useful.

@micromouseonline If you would rather not, then maybe there is no reason to? It seems there might not be much interest in having the size specified at the top. As @tokoro10g mentioned, if someone needs help while parsing, there is already micromouse_maze_tool for that.

micromouseonline commented 6 years ago

@Peque I will think more about this. There is still a nagging feeling that I need the metadata but it will not crystallise. (There is another shortcoming in the files: the names only tell you what kind of maze it is in the context of the path. So, for example, if you have a program with a list of recently opened files, you have no idea if they were classic or half-size.)

You may notice that the recently added half-size mazes all have names that end in 'hef' to mean 'half-size expert final' - even if there is not an expert class final.

If you are in a script-writing frame of mind :) could I persuade you to knock out something that would place the 'G' character in all four goal cells of the classic mazes? You have way better Python-fu than me. For consistency, I think the entire goal area should be marked up and, for classic mazes, it is predetermined.
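Something along these lines might do it. This is only a sketch and not the script that was eventually used. It assumes the input is just the maze rows (no blank lines), that cells are four characters wide, and that the maze has even dimensions so the goal is the four centre cells. The function name mark_centre_goals is hypothetical.

```python
def mark_centre_goals(lines):
    """Return a copy of the maze rows with 'G' in the four centre cells."""
    rows = [list(line) for line in lines]
    width = (len(lines[0]) - 1) // 4
    height = (len(lines) - 1) // 2
    for cy in (height // 2 - 1, height // 2):
        for cx in (width // 2 - 1, width // 2):
            row = rows[2 * cy + 1]  # text row holding cell row cy
            idx = 4 * cx + 2        # centre character of cell cx
            # Pad short rows so the index exists
            row.extend(" " * (idx + 1 - len(row)))
            row[idx] = "G"
    return ["".join(row) for row in rows]
```

Only the centre character of each goal cell is overwritten, so walls and posts are left alone.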

micromouseonline commented 6 years ago

@Peque In response to your earlier comment about not marking up the goal area in older maze files.

The goal area for classic mazes was always the four cells in the center; it was never a single cell. Also, the Taiwanese advance notice for half-size always specified the goal area and also stated which cell contained the entrance.

Peque commented 6 years ago

@micromouseonline Sure! I can take care of updating the Gs and the Ss. It should be trivial with Vim macros. :stuck_out_tongue_winking_eye:

I will have a look at it this afternoon and update the tests and README file.

Peque commented 6 years ago

Opened PR marking start and goal cells in classic mazes: https://github.com/micromouseonline/mazefiles/pull/8

Also opened a new issue to discuss how to mark goal cells in half-size mazes: https://github.com/micromouseonline/mazefiles/issues/9