Open JesusTheHun opened 2 years ago
We do the same thing. See runWithRetry()
It is super weird not to have a way to make sure things are available and ready to be queried. What is even weirder is that even with a mutation token, you cannot reach that level of trust in the availability of the document.
So if you batch import documents on behalf of your customer, you message your customer "ok we are ready, you can use the app", and then he logs in and the search results are inconsistent. This is super f'ed up. Is there any plan to address that?
I took a look and found this. https://issues.couchbase.com/browse/MB-50101. It will be in the 7.1.0 server release.
I'm not sure if you can access the issue, but it exposes an "FTS endpoint for knowing the index creation status"
They have exposed this endpoint:
curl -u Administrator:password http://10.144.220.101:8094/api/index/myTestIndex
{"status":"ok","indexDef":{"type":"fulltext-index","name":"myTestIndex","uuid":"7809ff9161466cef","sourceType":"gocbcore","sourceName":"my_bucket","sourceUUID":"6f1c001821506954398a1fab0684a40f","planParams":{"maxPartitionsPerPIndex":1024,"indexPartitions":1},"params":{"doc_config":{"docid_prefix_delim":"","docid_regexp":"","mode":"type_field","type_field":"type"},"mapping":{"analysis":{},"default_analyzer":"standard","default_datetime_parser":"dateTimeOptional","default_field":"_all","default_mapping":{"dynamic":true,"enabled":true},"default_type":"_default","docvalues_dynamic":false,"index_dynamic":true,"store_dynamic":false,"type_field":"_type"},"store":{"indexType":"scorch","segmentVersion":15}},"sourceParams":{}},"planPIndexes":[{"name":"myTestIndex_7809ff9161466cef_4c1c5584","uuid":"188b5a0f27926e97","indexType":"fulltext-index","indexName":"myTestIndex","indexUUID":"7809ff9161466cef","sourceType":"gocbcore","sourceName":"my_bucket","sourceUUID":"6f1c001821506954398a1fab0684a40f","sourcePartitions":"0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,911,912,913,914,915,916,917,918,919,920,921,922,923,924,925,926,927,928,929,930,931,932,933,934,935,936,937,938,939,940,941,942,943,944,945,946,947,948,949,950,951,952,953,954,955,956,957,958,959,960,961,962,963,964,965,966,967,968,969,970,971,972,973,974,975,976,977,978,979,980,981,982,983,984,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013,1014,1015,1016,1017,1018,1019,1020,1021,1022,1023","nodes":{"3fa3ec262f02a80b59ce58e77e781689":{"canRead":true,"canWrite":true,"priority":0}},"indexParams":{"doc_config":{"docid_prefix_delim":"","docid_regexp":"","mode":"type_field","type_field":"type"},"mapping":{"analysis":{},"default_analyzer":"standard","default_datetime_parser":"dateTimeOptional","default_field":"_all","default_mapping":{"dynamic":true,"enabled":true},"default_type":"_default","docvalues_dynamic":false,"index_dynamic":true,"store_dynamic":false,"type_field":"_type"},"store":{"indexType":"scorch","segmentVersion":15}}}],"warnings":[]}
That is related to index creation. My index is created and ready before my tests start. My issue is about a document being ready to be queried.
Use this URL:
curl -u <username>:<password> http://<ip>:8094/api/nsstats
Check for bucket_name:index_name:num_mutations_to_index = 0. In the unit tests, we use the OkHttp client for things like this.
{
"batch_bytes_added": 13198,
"batch_bytes_removed": 13198,
"curr_batches_blocked_by_herder": 0,
"my_bucket:myTestIndex:avg_grpc_internal_queries_latency": 0,
.
.
.
"my_bucket:myTestIndex:num_files_on_disk": 4,
"my_bucket:myTestIndex:num_mutations_to_index": 0,
The FTS folks pointed me to https://docs.couchbase.com/server/current/fts/fts-search-response.html#at_plus. It's not clear to me whether the scan_vector is required or not.
@mikereiche yeah the scan_vector is required with at_plus - that's the MutationToken you pass in from the SDK.
So at_plus with the scan_vector would only work for documents that the client itself had inserted. It could work for the unit test case, but would not work for something like a daily bulk load of documents.
@mikereiche there is a way to retrieve it for documents, but right now that is not exposed to the user since we only wanted to cover the RYOW use case.
We are going a bit beyond the initial scope of this issue, but let's dive into it.
@daschl I have an import service that inserts all docs with data from a third party. I would like it to be able to post a message to the broker saying "hey the import is ready, tell the customer". Typical batch size is a few thousand documents.
I could make a dumb request with the at_plus consistency, and once it returns I post my message. This requires the latest token from the last insert?
@mikereiche how do I get the mutation token of an insert from a Spring repository? We talked about it a few months ago; I don't know if you moved forward with that. You talked about a @MutationToken annotation, like the @Version one.
The FTS folks also suggested /api/stats/index/
"This requires the latest token from the last insert?" I don't believe the order in which the inserts actually took effect can be inferred from the order in which they were issued. And even if it could, if the inserts are done with the reactive API, there's no way to know the order in which the inserts actually occurred. One could look at the CAS, but then again, that's another assumption.
As I mentioned earlier, there is the "num_mutations_to_index". That will be zero when all mutations have been indexed. Is that not sufficient?
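The num_mutations_to_index check could be automated roughly like this. This is only a sketch under assumptions: it uses the JDK's built-in HttpClient (Java 11+) instead of OkHttp to stay dependency-free, it talks to a single FTS node, and the class name FtsIndexWaiter, its method names and the 200 ms poll interval are all made up for illustration. The stat is pulled out with a regex rather than a JSON parser, which only works for integer-valued stats.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Base64;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Couchbase or Spring Data API.
final class FtsIndexWaiter {

    /** Reads an integer stat such as "my_bucket:myTestIndex:num_mutations_to_index" from the nsstats JSON. */
    static OptionalLong readStat(String json, String key) {
        Pattern p = Pattern.compile(Pattern.quote("\"" + key + "\"") + "\\s*:\\s*(-?\\d+)");
        Matcher m = p.matcher(json);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    /** Fetches <baseUrl>/api/nsstats with HTTP basic auth and returns the raw body. */
    static String fetchNsStats(HttpClient client, String baseUrl, String user, String password)
            throws Exception {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/nsstats"))
                .header("Authorization", "Basic " + credentials)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Polls until num_mutations_to_index is 0 or the timeout elapses; returns true if it reached 0. */
    static boolean awaitIndexed(String baseUrl, String user, String password,
                                String bucket, String index, Duration timeout) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String key = bucket + ":" + index + ":num_mutations_to_index";
        long deadline = System.nanoTime() + timeout.toNanos();
        while (deadline - System.nanoTime() > 0) {
            OptionalLong pending = readStat(fetchNsStats(client, baseUrl, user, password), key);
            if (pending.isPresent() && pending.getAsLong() == 0) {
                return true;
            }
            Thread.sleep(200); // sleep between polls instead of spinning the CPU
        }
        return false;
    }
}
```

In a test this might be called as FtsIndexWaiter.awaitIndexed("http://localhost:8094", "Administrator", "password", "my_bucket", "myTestIndex", Duration.ofSeconds(30)) right after the inserts, failing the test if it returns false.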
For Query:
curl -u Administrator:password http://localhost:9102/api/stats/my_bucket/adv_name | jq .
{
"my_bucket:adv_name": {
"avg_drain_rate": 0,
"avg_item_size": 0,
"avg_scan_latency": 0,
"cache_hit_percent": 0,
"cache_hits": 0,
"cache_misses": 0,
"data_size": 7296,
"disk_size": 49152,
"frag_percent": 33,
"initial_build_progress": 100,
"items_count": 0,
"last_known_scan_time": 0,
"memory_used": 208,
"num_docs_indexed": 0,
"num_docs_pending": 0,
"num_docs_queued": 0,
"num_items_flushed": 0,
"num_pending_requests": 0,
"num_requests": 0,
"num_rows_returned": 0,
"num_scan_errors": 0,
"num_scan_timeouts": 0,
"recs_in_mem": 0,
"recs_on_disk": 0,
"resident_percent": 0,
"scan_bytes_read": 0,
"total_scan_duration": 0
}
}
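A test could turn that Query index output into a yes/no answer along these lines. The thread does not say which fields to look at, so treating initial_build_progress = 100 together with num_docs_pending = 0 and num_docs_queued = 0 as "caught up" is my assumption, and the class and method names are hypothetical. The regex-based extraction is a shortcut that only handles integer values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for interpreting the body returned by /api/stats/<bucket>/<index>.
final class GsiIndexStats {

    /** Reads an integer field (e.g. "num_docs_pending") from the stats JSON. */
    static long readLong(String json, String field) {
        Matcher m = Pattern.compile(Pattern.quote("\"" + field + "\"") + "\\s*:\\s*(-?\\d+)")
                .matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("stat not found: " + field);
        }
        return Long.parseLong(m.group(1));
    }

    /** Assumption: the index is caught up when it is fully built and nothing is pending or queued. */
    static boolean isCaughtUp(String json) {
        return readLong(json, "initial_build_progress") == 100
                && readLong(json, "num_docs_pending") == 0
                && readLong(json, "num_docs_queued") == 0;
    }
}
```

The JSON body would come from the same kind of authenticated GET as the curl call above, polled with a sleep between attempts until isCaughtUp returns true or a timeout is hit.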
Couchbase needs the MutationState/Token returned from a mutation as an input for queries/searches to ensure that indexing is complete, but there is no Spring Data API that returns the MutationState/Token, as Spring Data APIs are implementation-independent.
When writing tests that involve the FTS engine, I'm stuck with the stone-age technique of a CPU-busy loop on the thread to wait for the FTS engine to index my documents.
I have tried the following:
But it gave me inconsistent results. I guess it waits for the index to acknowledge the document, but the document is not yet ready to be searched. Sometimes, on a small machine overwhelmed by test suites, a dozen seconds can separate acknowledgement from readiness.
Is there a common way to deal with this?
Edit: in my tests I'm using FTS inside a N1QL query; I don't know if that matters.