apache / accumulo

Apache Accumulo
https://accumulo.apache.org
Apache License 2.0
1.08k stars 445 forks

Added Monitor property to optionally show extent information #5081

Closed (dlmarion closed this 1 week ago)

dlmarion commented 1 week ago

Closes #5078

dlmarion commented 1 week ago

I talked with @ctubbsii about this PR. He said that exposing this information in the Monitor was discussed a long time ago, and he recalled that it was left out because the Monitor does no user authentication, unlike the Shell. A user who authenticates to the Shell is allowed to read from the metadata table (scans require it, for example). There may be an alternate way to expose this information in the GetSplitsCommand code, if it doesn't do so already.

dlmarion commented 1 week ago

I created table test and added splits a-z and 0-9. Running the command getsplits -t <table> -v displays the obfuscated value and the extent information. Piping this command to grep directly did not work, but the following does work:

./accumulo shell -e "getsplits -t test -v -o /tmp/splits" && grep 'X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k=' /tmp/splits

X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k= (-inf, 0]
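For repeated lookups, the grep step above could be wrapped in a small helper. This is only a sketch: `lookup_extent` is a made-up name, and it assumes the saved file has one `<obfuscated id> <extent>` pair per line, as in the output shown above.

```shell
# Hypothetical helper: find the line for one obfuscated tablet ID in a file
# previously written by "getsplits -t <table> -v -o <file>".
# Assumption: each line looks like "<obfuscated id> <extent>".
lookup_extent() {
  id="$1"
  splits_file="$2"
  # -F matches the ID as a fixed string, so characters such as '+' and '/'
  # in the base64-style value are never treated as regex syntax.
  # "--" keeps an ID that happens to start with '-' from being read as an option.
  grep -F -- "$id" "$splits_file"
}
```

Usage would be along the lines of `lookup_extent 'X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k=' /tmp/splits`; the function returns a non-zero status when the ID is not in the file.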

ddanielr commented 1 week ago

> I created table test and added splits a-z and 0-9. Running the command getsplits -t <table> -v displays the obfuscated value and the extent information. Piping this command to grep directly did not work, but the following does work:
>
> ./accumulo shell -e "getsplits -t test -v -o /tmp/splits" && grep 'X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k=' /tmp/splits
>
> X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k= (-inf, 0]

Yeah the whole reason for the property was that there wasn't a way to do a lookup of the obfuscated value to match a tablet.

Piping directly to grep is an issue with the -e behavior in the shell. Looking at past issues, it appears to be related to changes within Jline.

dlmarion commented 1 week ago

FWIW, the following also works:

cat < <(./accumulo shell -e "getsplits -t test -v" 2>/dev/null) | grep -F "X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k="

dlmarion commented 1 week ago

> Yeah the whole reason for the property was that there wasn't a way to do a lookup of the obfuscated value to match a tablet.

@ddanielr - Do you think this is a sufficient solution and I can / should revert my Monitor changes?

ctubbsii commented 1 week ago

> Yeah the whole reason for the property was that there wasn't a way to do a lookup of the obfuscated value to match a tablet.

Since there actually is a way to do this lookup, I think we should revert this. I think it is risky to encourage exposing per-tablet info (containing user data) through an unauthenticated monitor service.

> Piping directly to grep is an issue with the -e behavior in the shell. Looking at past issues, it appears to be related to changes within Jline.

That error surprised me. I think that is a larger issue, and needs its own ticket... maybe an upstream ticket.

ddanielr commented 1 week ago

> Yeah the whole reason for the property was that there wasn't a way to do a lookup of the obfuscated value to match a tablet.
>
> @ddanielr - Do you think this is a sufficient solution and I can / should revert my Monitor changes?

The monitor currently shows the table ID and obfuscated value so a user can take those inputs and provide them to a shell session.

While the information can be obtained from a grep command, I'm assuming that getSplits does not provide the necessary information without the -v flag. So while the functionality is there, it's technically hidden from the user by default.

Since this is a standard action used in troubleshooting, I would be in favor of reverting the change if we added a shell command that takes the tableID and obfuscated value and returns the correct tablet ID.

ddanielr commented 1 week ago

Do we provide tooltips in the monitor? Some sort of link from the tablet row to the specific shell command would also help provide users with a way forward for troubleshooting.

ctubbsii commented 1 week ago

> Since this is a standard action used in troubleshooting, I would be in favor of reverting the change if we added a shell command that takes the tableID and obfuscated value and returns the correct tablet ID.

Given the current jline issues causing difficulty in piping the output to grep using -e, having an option to de-obfuscate a single obfuscated value would be useful:

getsplits [-t <table>] [-v] -id <obfuscated split id>
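Until such an option exists, the single-value lookup could be approximated outside the shell. The sketch below is not part of Accumulo: `find_tablet` and the `ACCUMULO_BIN` variable are hypothetical names. It reuses only the `getsplits -t <table> -v -o <file>` invocation shown earlier in this thread, and it assumes each output line has the form `<obfuscated id> <extent>`.

```shell
# Hypothetical wrapper approximating the proposed single-ID lookup.
# ACCUMULO_BIN is an assumed variable naming the accumulo launcher;
# it falls back to ./accumulo as used earlier in this thread.
find_tablet() {
  table="$1"
  id="$2"
  out_file="$(mktemp)" || return 1
  # Write the verbose split listing to a file rather than piping it,
  # because piping "shell -e" output to grep was reported not to work.
  "${ACCUMULO_BIN:-./accumulo}" shell -e "getsplits -t $table -v -o $out_file" >/dev/null 2>&1
  # Compare the first field exactly with the requested ID and print only
  # the extent that follows it; exit non-zero when nothing matched.
  awk -v id="$id" '$1 == id { sub(/^[^ ]+ /, ""); print; found = 1 }
                   END { exit !found }' "$out_file"
  status=$?
  rm -f "$out_file"
  return "$status"
}
```

A call such as `find_tablet test 'X+zrZv/IbzjZUnhsbWlsecLbwjndTpG0ZynXOif7V+k='` would then print just the extent. The exact first-field comparison is a deliberate choice so that one ID that is a substring of another cannot produce a false match.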

> Do we provide tooltips in the monitor? Some sort of link from the tablet row to the specific shell command would also help provide users with a way forward for troubleshooting.

We do in some places, and can add more, if needed. It can also be part of the paragraph above the table, or a note in a table caption.

dlmarion commented 1 week ago

This has been reverted and merged up through main.