Closed XVilka closed 3 years ago
Note, there are not only space separated arguments. There are also macroses, aliases, operators like ~
, |
, >
or >>
, etc. And the need to quote the whole string like:
[0x0000000]> "td struct qwe { int a; char* b; };"
will go to the null device. No more confusion from a newbie users!
This is the way to go! This must be done piecemeal and it may be most desirable to have a nice command completion first (zsh-like).
% r2 -a 6502
-- arch --
6502 -- 6502/NES/C64/Tamagotchi/T-1000 CPU
8051 -- 8051 Intel CPU
arc -- Argonaut RISC Core
arm -- Capstone ARM disassembler
arm.as -- as ARM Assembler (use ARM_AS environment)
arm.gnu -- Acorn RISC Machine CPU
arm.ks -- ARM keystone assembler
arm.winedbg -- WineDBG's ARM disassembler
...
I'd like to do some of the heavy lifting (or most? I need to feel familiar with most commands before next Thursday...) to move const char *help_msg*[]
arrays to toplevel.
Wasnt there already an issue for that? This wasnt possible to be done before because the syntax has changed and right now its pretty stable and consistent. But enforcing this with manual checks is not the way to go obviously.
On 20 Jul 2017, at 07:49, Fangrui Song notifications@github.com wrote:
This is the way to go! This must be done piecemeal and it may be most desirable to have a nice command completion first (zsh-like).
% r2 -a 6502 -- arch -- 6502 -- 6502/NES/C64/Tamagotchi/T-1000 CPU 8051 -- 8051 Intel CPU arc -- Argonaut RISC Core arm -- Capstone ARM disassembler arm.as -- as ARM Assembler (use ARM_AS environment) arm.gnu -- Acorn RISC Machine CPU arm.ks -- ARM keystone assembler arm.winedbg -- WineDBG's ARM disassembler ... I'd like to do some of the heavy lifting (or most? I need to feel familiar with most commands before next Thursday...) to move const char help_msg[] arrays to toplevel.
β You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
So i agree on that, will also speedup rcorecmd()
On 20 Jul 2017, at 07:49, Fangrui Song notifications@github.com wrote:
This is the way to go! This must be done piecemeal and it may be most desirable to have a nice command completion first (zsh-like).
% r2 -a 6502 -- arch -- 6502 -- 6502/NES/C64/Tamagotchi/T-1000 CPU 8051 -- 8051 Intel CPU arc -- Argonaut RISC Core arm -- Capstone ARM disassembler arm.as -- as ARM Assembler (use ARM_AS environment) arm.gnu -- Acorn RISC Machine CPU arm.ks -- ARM keystone assembler arm.winedbg -- WineDBG's ARM disassembler ... I'd like to do some of the heavy lifting (or most? I need to feel familiar with most commands before next Thursday...) to move const char help_msg[] arrays to toplevel.
β You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
Why thursday
On 20 Jul 2017, at 07:49, Fangrui Song notifications@github.com wrote:
This is the way to go! This must be done piecemeal and it may be most desirable to have a nice command completion first (zsh-like).
% r2 -a 6502 -- arch -- 6502 -- 6502/NES/C64/Tamagotchi/T-1000 CPU 8051 -- 8051 Intel CPU arc -- Argonaut RISC Core arm -- Capstone ARM disassembler arm.as -- as ARM Assembler (use ARM_AS environment) arm.gnu -- Acorn RISC Machine CPU arm.ks -- ARM keystone assembler arm.winedbg -- WineDBG's ARM disassembler ... I'd like to do some of the heavy lifting (or most? I need to feel familiar with most commands before next Thursday...) to move const char help_msg[] arrays to toplevel.
β You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
Cool! Glad you agree with moving help_msg*
arrays to toplevel.
@radare DEFCON ... And I should learn enough to use r2 in the CTF and evangelize it to my teammates...
I am not sure if moving help messages to top level without considering the way commands will be structured helps too much. Help messages could work by looking up a command, printing associated help message and iterating through direct sub-commands and printing their short messages. Such approach would ensure that there is no command which is not listed in help and allow automatic description generation for builtin aliases like Sj=iSj.
Here an approximate idea of what command description could contain:
Having multiple action types would permit leaving whole command subtrees using the old command description and converting them incrementally.
Negative argument offset would mean that command function receives command name as part of argument this would provide better compatibility with existing command functions during transition. Positive offset would allow automatically skipping space between command name and argument thus simplifying code in command itself.
I remember seeing some commands that changed behavior depending on type and number of arguments, those might be tricky to deal with.
@karliss Thanks for your enlightening draft of command description! Do you know how to make help message available for lookup in a centralized registry? I go the toplevel way because __attribute__((constructor))
magic can be used to register the help message.
@MaskRay if you're going to defcon, make sure you ping there π
DoNt use constructors for this please :p
On 20 Jul 2017, at 09:28, Fangrui Song notifications@github.com wrote:
@karliss Thanks for your enlightening draft of command description! Do you know how to make help message available for lookup in a centralized registry? I go the toplevel way because attribute((constructor)) magic can be used to register the help message.
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
i would love to do this refactoring.
my idea for this is to implement a function that parses the command line and fills a struct with all the info required to run everything:
On 20 Jul 2017, at 10:12, Giovanni notifications@github.com wrote:
i would love to do this refactoring.
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/7967#issuecomment-316630334, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3-lhCKUXfMEUAZ3_0-Z47BOyVkXBS4ks5sPwvQgaJpZM4Odl9L.
something like gnu opt
getopt?
On 20 Jul 2017, at 14:22, Giovanni notifications@github.com wrote:
something like gnu opt
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/7967#issuecomment-316686448, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3-lgSpi8j2978FkwFY0xUTlKo-pfwhks5sP0ZtgaJpZM4Odl9L.
Since there is mpc
in the radare2 sourcecode anyway (libr/egg/rlcc/mpc), may be it makes sense to use for parsing command line as well?
yes, getopt
. for mpc, i've never heard of it.
getopt parses command arguments, not syntactic stuff, the mpc thing was suposed to parse rlang or C, not sure if that will work for the kind of syntax we have in r2 commands.
also, this refactoring should be done in parallel as an option at runtime, to choose between different prompts otherwise we will have a lot of regressions
On 20 Jul 2017, at 14:41, Giovanni notifications@github.com wrote:
yes, getopt. for mpc, i've never heard of it.
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/7967#issuecomment-316690748, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3-lkVq0BpIIAaH4GLCF5AfFhh-eBD3ks5sP0sOgaJpZM4Odl9L.
radare said in the channel that other issues are of higher priority. This refactoring should be implemented in a separate loop that does not break current code.
What about having a boolean exp.cmd
variable to enable the refactored command parser?
Users can invoke r2 -e exp.cmd=1
to enable the feature.
Moving help messages (https://github.com/radare/radare2/pull/7969 ) can serve as a first step to the refactoring and the effect will be instant: a friendlier command completion system with help messages from const char *help_msg[]
arrays scattered all over the code. The case '?'
branches in the code will refer to const char *help_msg_ae[]
.
You may have concerns about having help messages not around the code, but can you shed light on an alternative to have a centralized help message?
i would say cmd.parser or cfg.newparser or something like that
thereβs no exp. prefix yet and i understand its experimental, but everytihng in r2land is experimental :P so thereβs no need for that namespace imho
On 21 Jul 2017, at 03:21, Fangrui Song notifications@github.com wrote:
radare said in the channel that other issues are of higher priority. This refactoring should be implemented in a separate loop that does not break current code.
What about having a boolean exp.cmd variable to enable the refactored command parser? Users can invoke r2 -e exp.cmd=1 to enable the feature.
Moving help messages (#7969 https://github.com/radare/radare2/pull/7969 ) can serve as a first step to the refactoring and the effect will be instant: a friendlier command completion system with help messages from const char *help_msg[] arrays scattered all over the code.
You may have concerns about having help messages not around the code
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/7967#issuecomment-316874957, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3-lhGVbiifwHz-NxROjrT5B-tzx15uks5sP_0fgaJpZM4Odl9L.
Before:
case 'u': // "?u"
{
char unit[32];
n = r_num_math (core->num, input+1);
r_num_units (unit, n);
r_cons_println (unit);
}
break;
After:
#include "r_args.h"
// #define ARG_STR(i) ctx.args[i].str_val
// #define ARG_INT(i) ctx.args[i].str_val
// #define ARG_ADDR(i) ctx.args[i].addr_val
// #define ARG_STR_DEF(i, def) (i < ctx.nargs ? ctx.args[i].str_val : def)
// #define ARG_INT_DEF(i, def) (i < ctx.nargs ? ctx.args[i].int_val : def)
// ...
int cmd_help(...) {
struct Context {
int nargs;
union Arg {const char* str_val; int int_val; ut64 addr_val;} args[9];
} ctx;
switch (input[0]) {
...
case 'u': { // "?u"
R_ARGS("s|iA"); // expands to if (parse_fail) {print_error; break;} else fill in ctx.args
// 0th arg as const char*, 1st arg as int, 2nd arg as ut64
r_cons_printf (ARG_STR(0), ARG_INT(1), ARG_ADDR(2));
break;
}
}
The final goal is to move R_ARGS
to the command descriptor function so that arguments can be completed. But before that, these R_ARGS
will be kept under case labels.
If I understood your example correctly, you changed the syntax for the help command from u?
to ?u
, is that right?
From what I understand your code wouldn't implement any sort of AST?
Wat
On 6 Mar 2018, at 10:11, xarkes notifications@github.com wrote:
If I understood your example correctly, you changed the syntax for the help command from u? to ?u, is that right? From what I understand your code wouldn't implement any sort of AST?
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
If I understood your example correctly, you changed the syntax for the help command from u? to ?u, is that right?
No. I just gave an example that argument parsing should be done properly, not strchr strstr r_num_math here and there.
Yep ok good.
On Sun, Mar 11, 2018, 21:03 Fangrui Song notifications@github.com wrote:
If I understood your example correctly, you changed the syntax for the help command from u? to ?u, is that right?
No. I just gave an example that argument parsing should be done properly, not strchr strstr r_num_math here and there.
β You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/7967#issuecomment-372144478, or mute the thread https://github.com/notifications/unsubscribe-auth/AKRgfdHP-TzcEl5cMp5gNVDHqgV66sVoks5tdYMTgaJpZM4Odl9L .
Note that it would be perfect if each command has a public API.
On Cutter we mostly use r_core_cmd
and R_API functions. It appears that the R_API are less reliable than the commands themselves (which sucks).
There is now an autocomplete feature https://github.com/radare/radare2/blob/master/libr/core/core.c#L1923
static void init_autocomplete (RCore* core) {
core->autocomplete = R_NEW0 (RCoreAutocomplete);
/* flags */
r_core_autocomplete_add (core->autocomplete, "s", R_CORE_AUTOCMPLT_FLAG, true);
r_core_autocomplete_add (core->autocomplete, "s+", R_CORE_AUTOCMPLT_FLAG, true);
r_core_autocomplete_add (core->autocomplete, "b", R_CORE_AUTOCMPLT_FLAG, true);
r_core_autocomplete_add (core->autocomplete, "f", R_CORE_AUTOCMPLT_FLAG, true);
cmd_*.c
files@wargio
We can start with defining grammar of radare2 commands with MPC: https://github.com/orangeduck/mpc#language-approach @thestr4ng3r already did a good job with porting of some of the C parsing on top of MPC, so it can be used as a starting point: https://github.com/radare/radare2/blob/master/libr/parse/ctype.c
After this is done we can improve
I will offer 0.1 BTC to the one who will fix this issue. Initially, I wanted to use BountySource, but its support for bitcoins is a plain nightmare. So I will offer it directly.
Bounty is up to 0.2 BTC now
Scrap the MPC idea, Tree-Sitter is a better approach, would allow parsing as typed
neovim is integrating tree-sitter, but I think it may be an overkill for radare2 to integrate it. A simple top-down operator precedence parsing is probably all that is needed. The complicated thing is of course the unclear semantics and expectation of the current syntax. To keep it fully backward compatible is probably unrealistic (the long tail).
We are also considering to use tree-sitter for (partially) parsing C (headers, types), so having it in r2 can make sense.
Agreed about backwards compatibility. It should be compatible where it makes sense, but it doesn't have to be in every way.
For example I don't like the "?E clippy@rada.re"
syntax. I would prefer ?E "clippy@rada.re"
like any other shell does.
Bounty is up, 0.3 BTC now.
About backward compatibility - yes, it should be mostly compatible, with fixes to the tests/other commands where this is unrealistic.
I'm going to use this issue to discuss problems I'm encountering while implementing the new parser in tree-sitter. Right now, you can do things like pd 10 +2
or pd 5 @ main +2
, which I think is not good because it is not clear how to split arguments.
How do you know pd
should be called as pd(12)
and not as pd(10,+,2)
?
Or, to make the example clearer, let's look at om
, which accepts more than one argument om fd vaddr [size] [paddr] [rwx] [name]
. If math expressions can have spaces, how do you distinguish arguments? How do you interpret om 3 4 +10 + 20000
? Is it om(3, 4, +10, 20000)
? Or is it om(3, 14, 20000)
? om(3, 20014)
? Or something else?
For the reasons above, I think math expressions should either be without spaces or wrapped between ()
to make it clear it's just one "argument". WDYT?
I think yes, allowing spaces only in the wrapped math expressions is the way to go.
On Sat, Feb 15, 2020, 00:37 Riccardo Schirone notifications@github.com wrote:
I'm going to use this issue to discuss problems I'm encountering while implementing the new parser in tree-sitter. Right now, you can do things like pd 10 +2 or pd 5 @ main +2, which I think is not good because it is not clear how to split arguments.
How do you know pd should be called as pd(12) and not as pd(10,+,2)?
Or, to make the example clearer, let's look at om, which accepts more than one argument om fd vaddr [size] [paddr] [rwx] [name]. If math expressions can have spaces, how do you distinguish arguments? How do you interpret om 3 4 +10 + 20000? Is it om(3, 4, +10, 20000)? Or is it om(3, 14, 20000)? om(3, 20014)? Or something else?
For the reasons above, I think math expressions should either be without spaces or wrapped between () to make it clear it's just one "argument". WDYT?
β You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/radareorg/radare2/issues/7967?email_source=notifications&email_token=AABRT7JMNSTDY7E5Z6N2Z4DRC3CFLA5CNFSM4DTWL5F2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELZT7TQ#issuecomment-586366926, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABRT7LIQYBW4OGGJ5SABJDRC3CFLANCNFSM4DTWL5FQ .
Update: basic command structure can now be parsed with the new tree-sitter parser, enabled with cfg.newshell
.
One more example on supporting highlighting with tree-sitter (for OCaml language): https://github.com/tree-sitter/tree-sitter-ocaml/pull/42 cc @ret2libc
We can close this issue. For the curious ones:
Rather than allow each command handler to parse itself, there is a need for a proper common parser of the command line, providing just a set of tokens to the handler function. This will allow to improve the syntax parser, improve commands help and syntax error reporting, easier and niftier autocompletion, may be even syntax highlight of the command line, right when you type. And many more things.