Feature request: Add ability for extra fields (e.g. problem id)

dayo05 commented 5 months ago

Lots of plugins, including AutoCP, support creating files based on the problem's name or some of its contents.

For instance, for the example in the README, the file name would be Castle_Defense.cpp or similar.

Some OJs use a problem ID to identify problems. In particular, Baekjoon Online Judge uses a 4- to 5-digit number. Codeforces uses [ContestNumber][ProblemId], like 954G.

Every OJ has a different format for it, so I suggest adding the ability to have a custom, optional field that can be specific to each OJ (like the problem number in Baekjoon Online Judge).

touhidurrr commented 5 months ago

Good request I guess. An optional problem id field looks like a good idea.

dayo05 commented 5 months ago

Alternatively, separating the group property could achieve a similar goal.

jmerle commented 5 months ago

I've got three thoughts about this:

1. I do not want to make backwards incompatible changes to the output format anymore. Separating the group property is not going to happen.
2. I do not want to add custom fields to the output format that only exist for a subset of judges. The goal of Competitive Companion is to map the content of dozens of online judges to a generic data format that is easy to use in external tools, and in my opinion custom per-judge fields do not fit that vision.
3. Requests related to having problem ids in the output data or file names are somewhat common (#163 #181 #280 #418 #462). In the past I've always taken the stance that external tools should do such conversions on their own (in most cases you can extract the problem id from the problem url), but now that I'm thinking about it again and given that most judges have publicly-exposed problem ids or shorthands like the "954G" you mentioned, I'm not completely opposed to it anymore.

Thinking about it, if I were to implement this, I think the best way would be to add an optional, non-nullable "id" field to the output format containing the short name. Optional for backwards compatibility, and non-nullable because in case an online judge does not have the concept of ids (or does not expose them publicly), we can make the field default to the problem's name.
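
To make the proposal concrete, here is a minimal sketch of how an external tool might read such a field. It assumes the field is called "id" exactly as proposed above (it is not part of the current output format), and the helper name `problem_id` and the sample payload are made up for illustration. Because the field would be optional, a client still needs its own fallback when talking to an older version of the extension.

```python
import json


def problem_id(problem: dict) -> str:
    """Return the proposed optional "id" field if present, else the problem name.

    "id" is the hypothetical field discussed in this thread; "name" is an
    existing field of the Competitive Companion output format.
    """
    value = problem.get("id")
    if isinstance(value, str) and value:
        return value
    # Older extension versions would not send "id" at all.
    return problem["name"]


# Illustrative payload only; the "id" key is the proposed addition.
payload = json.loads("""
{
  "name": "G. Castle Defense",
  "group": "Codeforces - Educational Codeforces Round 40 (Rated for Div. 2)",
  "url": "https://codeforces.com/problemset/problem/954/G",
  "id": "954G"
}
""")

print(problem_id(payload))
```

With this shape, a tool written against the old format keeps working unchanged, and a tool that knows about the field can prefer it over the name.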

Thinking about potential issues, without having concrete examples at hand, I think the following could create issues:

- Problem ids like "CON" could create issues on Windows, where CON is a reserved filename.
- Problem ids containing forward slashes, backward slashes, or colons might exist and cause issues as these characters are directory separators on some operating systems.
- Problem ids may not necessarily be ASCII-only, which could lead to issues like #431.

Most of these potential issues are OS-specific and/or tool-specific, so I think it's best to give no guarantees about the id field other than it being a non-empty string if the field exists. Any post-processing to make the id fit an external tool's purpose would be left to the external tool to deal with.
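
As an illustration of the kind of post-processing an external tool could apply, the sketch below turns an arbitrary id into a file name stem. The function name `safe_file_stem`, the `ascii_only` switch, and the "problem" fallback are assumptions made for this example, and the rules are not exhaustive; each tool would pick its own policy.

```python
import re
import unicodedata

# Device names that Windows reserves regardless of letter case.
WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def safe_file_stem(problem_id: str, ascii_only: bool = False) -> str:
    """Turn a problem id into a string that is safer to use as a file name stem."""
    stem = unicodedata.normalize("NFKC", problem_id)
    if ascii_only:
        # Drop every character that has no ASCII representation.
        stem = stem.encode("ascii", "ignore").decode("ascii")
    # Replace path separators, colons, and other characters Windows rejects.
    stem = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", stem)
    # Windows does not allow names ending in a space or a dot.
    stem = stem.strip(" .")
    if not stem:
        stem = "problem"  # hypothetical fallback when nothing usable is left
    if stem.upper() in WINDOWS_RESERVED:
        stem = "_" + stem
    return stem


print(safe_file_stem("954G"))
print(safe_file_stem("CON"))
print(safe_file_stem("a/b:c"))
```

Keeping this step in the client means the extension itself only has to promise a non-empty string, as suggested above.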

As a major downside, this requires updating all parsers to support this new field (or at least all parsers I consider "somewhat popular", which are at least a few dozen of them). On the other hand, I've been planning to go through all parsers to make sure they're consistent and up-to-date for a while now, and it would be easiest to add this feature during that process.

I will need some more time to think about this. How do you feel about the above?

touhidurrr commented 5 months ago

Maybe open a new id branch and start working on this in your free time. You can also make an issue that lists the judges that you want to update. Then I can go through the list and try to add id support one by one for as many judges as I can. No rush; with a separate branch we can do this slowly. As for reviewing all existing parsers, I think that would be a you-only thing. I don't think anyone else knows better than you which blocks of code in which parsers fix what. That's all I can say with the intention of helping as much as I can.

jmerle commented 5 months ago

Even though I appreciate the offer, I'm planning to do this round of parser maintenance (including potential additions like parsing ids) on my own.

royqh1979 commented 1 month ago

OJs need an id to identify each problem. But why do Competitive Companion clients need it?

It seems that the problem's url is enough for clients to identify it.

dayo05 commented 1 month ago

In most cases, the problem id can be retrieved from the url. If a client wants to use the problem id, it can parse it from there. But the problem id is frequently used for naming files, for easy identification. Using the problem name (as AutoCP does) can lead to issues like invalid characters, especially when the problem name is not ASCII (Korean, Chinese, etc.). Also, the problem id is better for sharing the problem with others. Problem names are sometimes too long, and different problems can share the same name.

For these reasons, IMO that is reason enough to include an optional problem-id field in the core code, because lots of clients want to convert the url into a problem id when only the url is given.
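
For comparison, this is roughly what each client has to maintain today when only the url is available. It is a sketch covering just the two judges mentioned in this thread; the pattern table and the function name `id_from_url` are assumptions for illustration, and real judges have more url variants than shown here.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Hypothetical per-judge url patterns; every client would need its own table.
_PATTERNS = {
    "codeforces.com": [
        re.compile(r"^/problemset/problem/(\d+)/(\w+)$"),
        re.compile(r"^/contest/(\d+)/problem/(\w+)$"),
    ],
    "www.acmicpc.net": [
        re.compile(r"^/problem/(\d+)$"),
    ],
}


def id_from_url(url: str) -> Optional[str]:
    """Return a short problem id parsed from the url, or None if the judge is unknown."""
    parsed = urlparse(url)
    for pattern in _PATTERNS.get(parsed.hostname or "", []):
        match = pattern.match(parsed.path)
        if match:
            # Codeforces: contest number followed by problem letter, e.g. "954G".
            return "".join(match.groups())
    return None


print(id_from_url("https://codeforces.com/problemset/problem/954/G"))
print(id_from_url("https://www.acmicpc.net/problem/1000"))
```

Every judge added to such a table is work duplicated across clients, which is the argument for exposing the id once in the extension.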

touhidurrr commented 1 month ago

> Using the problem name (as AutoCP does) can lead to issues like invalid characters, especially when the problem name is not ASCII (Korean, Chinese, etc.).

I also agree with this. I personally had the same issue with a judge that has Bangla problems. If this extension provided an ID, other software might prioritize the ID over the name when generating file names. This could potentially solve many edge cases.

jmerle / competitive-companion

feature request: Add ability for extra fields(i.e. problem id) #466