npawar opened this issue 4 years ago
@npawar What's the difference between this API and remove all segments API? I can see the truncate API also disables the table, but I don't think it should do so.
@Jackie-Jiang there's no difference. It's just packaging the Pinot operations into a command that users are familiar with (database terminology).
Yes, the table should not be permanently disabled. I noted disabling it so that we stop sending queries; I forgot to write "re-enable" after deleting all segments.
@npawar IMO we don't need to disable then re-enable the table. If the user sends queries while truncating the table, the result is expected to be indeterministic. If we disable the table, no Helix transition message will be sent, and all the transitions will happen when the table gets re-enabled. The behavior will be the same as not disabling the table.
Actually I'm not sure about the Helix behavior when you modify the IdealState of a disabled resource
We may need to do this carefully. Need to consider race conditions like:
For these reasons, I prefer that we don't complicate the Pinot admin APIs on the controller. Instead, provide an admin command that can do more complex operations using the primitives there, taking care of the other conditions and correcting for them.
@mcvsubbu Pinot admin commands are simply wrappers on the Pinot controller admin APIs, right?
We can disable a table first and then truncate it. This is a useful command; let's not worry too much about all the corner cases. When someone is doing a truncate, they don't really care about the data.
@kishoreg Maybe you forgot :) but I know how hard it was to detect (let alone debug) the phenomenon of two consuming segments for the same partition in the same replica.
Yes, this is a corner case, but when an issue is reported, we need to debug it. It was hard even when we had all the machines and logs at hand and could look at ZooKeeper logs at will; imagine doing it remotely when some open source user reports a problem and we can't even GUESS that there are parallel consuming segments.
We added plenty of comments and put the logic in one class so that we don't stumble on such things again.
We make a strong assumption here that there are only two threads in the entire cluster that create segments. By starting a new validation manager, we violate that assumption.
What's the fix?
Good question. I don't have an answer at the moment. As a reviewer, the best I can do (and sometimes even that is not possible) is to look at the problems it may bring about.
Keeping the truncate facility as a command line argument will help, because we will NOT be running the validation manager, etc. We can check things all we want, report errors to the user, the user can run it again, etc. Why can't we do that?
Providing primitive APIs, with command lines on top of them, is a good practice for most things, especially this one.
One way to alleviate such a problem would be to scan the ideal state for duplicate segments in the same partition/sequence pair. If there is an attempt to add such a duplicate, then throw a permanent exception and do not allow the segment to be added. This will at least help us a bit in debugging, if we get controller logs and see this exception.
This self-checking code will appear to be a patch against a bug that we don't know about. I don't like it, but it is an alternative if we are going to have multiple threads (across controller hosts) update the ideal state as is being suggested here. It is also not clear what the ideal state updater should do when this exception is encountered. Ideally, I would like some corrective action taken instead of just moving on, but we can think about it. At the minimum, set a metric so we can alert on it.
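For illustration, such a duplicate check could look like the sketch below. This is hedged Python, not Pinot code: it assumes low-level consumer segment names follow the `{table}__{partition}__{sequence}__{creationTime}` naming convention, and the function name is hypothetical.

```python
from collections import Counter

def find_duplicate_partition_sequence(segment_names):
    """Return (partition, sequence) pairs that appear more than once
    among the given segment names.

    Assumes names follow the "{table}__{partition}__{sequence}__{ts}"
    convention; names that don't match are ignored.
    """
    pairs = []
    for name in segment_names:
        parts = name.split("__")
        if len(parts) == 4:
            pairs.append((int(parts[1]), int(parts[2])))
    # Any pair seen twice means two segments for the same
    # partition/sequence, which should never happen.
    return [pair for pair, count in Counter(pairs).items() if count > 1]
```

A guard like this would run before committing an ideal state update, raising (and bumping a metric) when the returned list is non-empty.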
Another solution is to approach this the right (but complex) way of going through a znode that allows exactly one thread on any one machine to update the idealstate of a table. Then we can freely start validation manager whenever we feel like it, on whichever controller receives some complex request like this.
But I go back to my original question. Why provide more than primitives as API, and let the administrative comands (or GUI or whatever) do multiple operations to achieve what the end user wants? Advantages:
How is this truncate different from drop table?
My proposal is:
We already do the first 3 steps for delete/drop table.
@kishoreg what happens if the controller fails over to another one in between any of these steps?
Client gets an error response and tries it again. All the steps are idempotent.
This is the same as any other call that involves multiple steps, right?
For example, addTable can fail before setting up realtime table consumption.
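The retry argument above can be sketched in plain Python (all names here are hypothetical, not Pinot APIs): because every step is idempotent, the client can simply re-run the whole sequence after a failure, and re-running an already-completed step is harmless.

```python
def run_idempotent(steps, max_attempts=3):
    """Run a sequence of idempotent steps, retrying the whole
    sequence on failure.

    Safe only because re-running an already-completed step
    (e.g. disable table, delete segments) has no extra effect.
    """
    for attempt in range(max_attempts):
        try:
            for step in steps:
                step()
            return True
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
    return False
```

If a controller fails over mid-sequence, the client sees an error and calls `run_idempotent` again; steps that already took effect are simply repeated as no-ops.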
@kishoreg Step 4 - Enable the table
Done
Step 5 - Call the setup table method
Can you give some idea or share some resource about this step?
@npawar can you point him to setup table code or the validation manager.
@Sreemanth take a look at PinotHelixResourceManager#addTable. In that method, if you trace the realtime path, you will see the call to ensureRealtimeClusterIsSetUp. That call will set up the realtime table again. Please ping me if you have any questions.
@npawar Ensure table setup logic added.
Please add me to the reviewer list when you have a PR.
And if possible, do not submit one huge PR. It will enable timely (and better) reviews.
As I see it, there are multiple parts in this that can go in independently. You can add these before exposing the final API. But then definitely get the API started in a design document so that we can comment on it. It is good practice not to debate an API in PRs.
If it is not a design doc, we can debate on the API here in the issue.
Hi folks! Is there any news on this feature? I'm starting to use Pinot with a system I develop, based on Python, and it's been quite a struggle to work on integration tests because I haven't yet found an easy and reliable way to start each test from a clean slate. Having a "truncate table" functionality would be immensely useful for my project!
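As a stopgap for test isolation, one can call the controller's delete-all-segments endpoint between tests. A minimal sketch follows, assuming the controller runs at the default address and exposes `DELETE /segments/{tableName}?type=...` (check the REST docs for your Pinot version before relying on this):

```python
import urllib.parse
import urllib.request

# Assumption: controller at the default host/port.
CONTROLLER = "http://localhost:9000"

def build_delete_all_segments_request(table_name, table_type="OFFLINE"):
    """Build a DELETE request that removes every segment of a table
    while leaving the table config and schema in place."""
    query = urllib.parse.urlencode({"type": table_type})
    url = f"{CONTROLLER}/segments/{table_name}?{query}"
    return urllib.request.Request(url, method="DELETE")

# In a test fixture you might then do:
#   urllib.request.urlopen(build_delete_all_segments_request("myTable"))
```

Note this does not cover the realtime case fully, since consuming segments would need to be recreated, which is exactly what the proposed truncate would handle.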
Add truncate operation i.e. delete all contents of table, without removing table config and schema.
For offline, this simply means deleting all segments (everything a delete operation does, but keeping the table/schema definition).
For realtime, this means deleting all segments, both completed and consuming. Then we should recreate the ideal state, just as we would for a new realtime table.
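The semantics above can be summarised as an ordered plan. The sketch below is illustrative Python with hypothetical operation labels, not real Pinot APIs; it only captures the step ordering discussed in this thread (disable, delete, recreate ideal state for realtime, re-enable).

```python
def truncate_plan(table_name, realtime):
    """Return the ordered operations a truncate might perform.

    The labels are illustrative placeholders for controller
    operations, not actual Pinot endpoint names.
    """
    plan = [
        ("disable_table", table_name),        # stop serving queries
        ("delete_all_segments", table_name),  # completed + consuming
    ]
    if realtime:
        # Set consumption up again, as for a brand-new realtime table.
        plan.append(("recreate_ideal_state", table_name))
    plan.append(("enable_table", table_name))  # resume serving
    return plan
```

For an offline table the plan reduces to disable, delete, enable; the extra ideal-state step is what distinguishes realtime truncate from a plain delete-all-segments.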