shikhar / eskka

elasticsearch discovery plugin using akka cluster

jepsen testing #6

Open shikhar opened 10 years ago

shikhar commented 10 years ago

@AeroNotix I don't expect to get around to this until next weekend, as I'm heading on vacation. Do you want to take a crack at it? Or I can ping you with questions once I'm back.

shikhar commented 10 years ago

made some progress on getting jepsen's elasticsearch test going (and producing a split-brain)

ty @abailly for the vagrantization, really helps! I hit some errors, so I ended up running the setup scripts manually on the vagrant VM. I'll try to reproduce and report back

abailly commented 10 years ago

My pleasure! If I can help, do not hesitate to ping me back.

AeroNotix commented 10 years ago

From what I gather, you should just be able to modify the setup method of the Elasticsearch DB implementation to install your plugin, no?

You shouldn't need to modify the tests or anything (in fact, I would argue you should definitely not modify the tests.)

shikhar commented 10 years ago

yes, that's the approach I am going with

current diff https://gist.github.com/shikhar/84945b975432bdf385ef

shikhar commented 10 years ago

using eskka:

{:valid? true,
 :html {:valid? true},
 :set
 {:valid? true,
  :lost "#{}",
  :recovered
  "#{267..268 270..273 275 278..279 286..287 289..292 294..296 299 301..302 306..308 310 312..314 317 319..320 323..325 327..329 333..335 338..340 343..344 349..352 355..357 359..361 364 366..368 370..372 379 382 385 388..390 393..394 396 398..399 401 404..405 407 409..410 412 414..416 419 422 424 426 428 430 432..433 435..438 440 443..444 449..450 452..455 457 460 462 465..466 468..470 475..476 479 482 484..486 488..495 498..500 502..504 506 508..509 511 513..516 518..519 522..524 526 531 533..534 536 538 541 543..544 549 552 554..560 564..565 569 571 573 577..578 580 582..586 588..589 591 598 602..603 605..610 614 616 619..620 622 625..626 629..630 635..638 640..643 645 647 651..652 655..656 658..661 664..666 668..670 672 675..677 680..683 685 688..690 692..695 697..700 702 704..705 707..710 712..715 717 719 724..727 729..732 735..737 739 741..742 744 746..747 749..752 754..757 759..760}",
  :ok
  "#{0..273 275 277..279 281 284 286..287 289..292 294..296 299..302 306..308 310..314 316..320 323..325 327..329 333..335 338..340 343..344 348..352 355..357 359..362 364..368 370..372 375..377 379 382 384..385 388..390 393..396 398..401 404..407 409..410 412 414..416 419 422 424..426 428 430 432..433 435..438 440..444 449..450 452..455 457..460 462 465..466 468..472 475..477 479 481..486 488..495 498..500 502..506 508..509 511 513..516 518..524 526 531 533..534 536 538 541 543..545 549 552..566 568..569 571 573 576..592 594..598 600 602..603 605..610 614..616 619..622 625..627 629..630 634..638 640..643 645..647 651..656 658..661 664..666 668..672 675..677 679..683 685..690 692..695 697..700 702 704..705 707..710 712..715 717..719 721 723..727 729..732 735..737 739 741..742 744 746..747 749..752 754..757 759..760}",
  :recovered-frac 282/761,
  :unexpected-frac 0,
  :unexpected "#{}",
  :lost-frac 0,
  :ok-frac 612/761}}

Ran 1 tests containing 1 assertions.
0 failures, 0 errors.
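
For readers decoding a report like the one above: the sets are printed as compact interval strings (`0..273` means every integer from 0 through 273), and the `-frac` values look like counts divided by the number of attempted writes, reduced to lowest terms. The Python sketch below is one way to expand those strings and reproduce the classification. The meaning assigned to each key (`ok`, `lost`, `recovered`, `unexpected`) is an assumption based on the key names and on how Jepsen's set checker is commonly described, not something stated in this thread; `parse_interval_set` and `classify` are hypothetical helper names.

```python
from fractions import Fraction


def parse_interval_set(text):
    """Expand a string such as '#{0..3 5 7..8}' into a set of ints.

    Tokens are split on any whitespace, so a set that was wrapped
    across several lines in a pasted report still parses.
    """
    body = text.strip()
    if body.startswith("#{") and body.endswith("}"):
        body = body[2:-1]
    values = set()
    for token in body.split():
        if ".." in token:
            low, high = token.split("..")
            values.update(range(int(low), int(high) + 1))  # both bounds inclusive
        else:
            values.add(int(token))
    return values


def classify(attempted, acknowledged, final_read):
    """Classify writes under the ASSUMED semantics of the report keys.

    attempted    -- every value a client tried to add
    acknowledged -- values whose add was confirmed to the client
    final_read   -- values present in the final read of the set
    """
    ok = final_read & attempted            # attempted and present at the end
    lost = acknowledged - final_read       # confirmed, yet missing at the end
    recovered = ok - acknowledged          # never confirmed, yet present
    unexpected = final_read - attempted    # present but never attempted
    total = len(attempted)
    return {
        "valid": not lost and not unexpected,
        "ok": ok,
        "lost": lost,
        "recovered": recovered,
        "unexpected": unexpected,
        "ok_frac": Fraction(len(ok), total),
        "lost_frac": Fraction(len(lost), total),
        "recovered_frac": Fraction(len(recovered), total),
    }
```

`Fraction` reduces automatically, which would explain why a report can show a denominator smaller than the number of attempted writes.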

using es 1.2.1 zen:

FAIL in (create-test) (elasticsearch_test.clj:84)
expected: (:valid? (:results test))
  actual: false
{:valid? false,
 :html {:valid? true},
 :set
 {:valid? false,
  :lost
  "#{513..514 516 518..527 529 532..533 535 538 541 543..547 549 551 553 555..556 558 561 563 565 567..569 571..577 579..580 582..584 586..590 593..599 601..603 605 607..608 612 614 629 633..634 637 644 647 652 655 660 666..667 670 677 686 688 698 700 703 705 708 710 718 720 725 729 735 738 740 745 747 760 763 765 770 775 778 780 783 785 788 793 795 798 805..806 820 825 830 832 838 841 852 857 862 867 873 876..877 881 883 886..887 891 898 901 906 911 913 915 918 922..923 927 932..933 937..938 943 953 958 962..963 967 972 977 982..983 987 992 1002 1007 1024 1032 1034 1037 1047 1049 1054 1062 1069 1072 1074 1082 1087 1092..1093 1098 1114 1123..1124 1128..1129 1139 1144 1148..1149 1159 1163 1167 1169 1172 1197 1199 1202 1204 1209 1214 1217 1222 1224 1227 1232 1242 1244 1249 1252 1254 1258 1264 1268 1273..1274 1279 1284 1287 1289 1292 1309 1312 1319 1322 1328 1337 1343 1347 1353 1357..1358 1367 1372..1373 1377..1378 1382..1383 1388 1392..1393 1402..1403 1407 1411 1416..1417 1421 1426 1428 1436 1447 1453 1456..1457 1461 1463 1468 1471 1477 1483 1488 1491 1496..1497 1501 1512 1514 1519 1521 1527 1531 1534 1539 1541 1546 1550 1556 1566 1571 1574 1576 1586 1590 1596 1606 1611 1620..1621 1625 1630 1635 1644 1647 1649 1657 1659 1664 1667 1669 1678 1683 1689 1693..1694 1698..1699 1703..1704 1710 1713 1718 1720 1723 1736 1738 1747 1752..1753 1757 1761 1766 1771 1773 1776 1781 1791 1796 1801 1803 1805 1811 1818 1836 1838 1843 1846 1848 1855 1866 1873 1876 1878 1885 1888 1894..1895 1900 1905 1907 1910 1916..1917 1921 1926..1927 1931..1932 1936..1937 1941 1946 1951}",
  :recovered
  "#{266 268..269 273 284 293 313 315 318 321 324 329 334 337 347..349 352 354 359 371 374 381..382 389 392 398..399 403 406 409 419 427 439 441 447..448 463 465 474 477..478 482 484 490..491 497 501..502 505 512 515 536 539 559 566 581 585}",
  :ok
  "#{0..266 268..273 276 279..280 282..284 288 293..295 297..299 301..302 306 309..313 315 318 321..322 324 329 333..334 337 340 343..344 347..349 352 354 359..360 362..363 365..372 374..375 381..382 384 386..387 389..392 394 397..399 403..404 406 408..409 412 414 416..419 424..427 429 432 435 439..442 444 447..448 450 452..454 456 458..461 463 465 467..468 470 472..474 477..479 481..484 486..487 489..491 494 496..502 505..512 515 517 528 530..531 534 536..537 539..540 542 548 550 552 554 557 559..560 562 564 566 570 578 581 585 591..592 600 604 606 609..611 613 615..628 630..632 635..636 638..643 645..646 648..651 653..654 656..659 661..665 668..669 671..676 678..685 687 689..697 699 701..702 704 706..707 709 711..717 719 721..724 726..728 730..734 736..737 739 741..744 746 748..759 761..762 764 766..769 771..774 776..777 779 781..782 784 786..787 789..792 794 796..797 799..804 807..819 821..824 826..829 831 833..837 839..840 842..851 853..856 858..861 863..866 868..872 874..875 878..880 882 884..885 888..890 892..897 899..900 902..905 907..910 912 914 916..917 919..921 924..926 928..931 934..936 939..942 944..952 954..957 959..961 964..966 968..971 973..976 978..981 984..986 988..991 993..1001 1003..1006 1008..1023 1025..1031 1033 1035..1036 1038..1046 1048 1050..1053 1055..1061 1063..1068 1070..1071 1073 1075..1081 1083..1086 1088..1091 1094..1097 1099..1113 1115..1122 1125..1127 1130..1138 1140..1143 1145..1147 1150..1158 1160..1162 1164..1166 1168 1170..1171 1173..1196 1198 1200..1201 1203 1205..1208 1210..1213 1215..1216 1218..1221 1223 1225..1226 1228..1231 1233..1241 1243 1245..1248 1250..1251 1253 1255..1257 1259..1263 1265..1267 1269..1272 1275..1278 1280..1283 1285..1286 1288 1290..1291 1293..1308 1310..1311 1313..1318 1320..1321 1323..1327 1329..1336 1338..1342 1344..1346 1348..1352 1354..1356 1359..1366 1368..1371 1374..1376 1379..1381 1384..1387 1389..1391 1394..1401 1404..1406 1408..1410 1412..1415 1418..1420 1422..1425 1427 1429..1435 1437..1446 
1448..1452 1454..1455 1458..1460 1462 1464..1467 1469..1470 1472..1476 1478..1482 1484..1487 1489..1490 1492..1495 1498..1500 1502..1511 1513 1515..1518 1520 1522..1526 1528..1530 1532..1533 1535..1538 1540 1542..1545 1547..1549 1551..1555 1557..1565 1567..1570 1572..1573 1575 1577..1585 1587..1589 1591..1595 1597..1605 1607..1610 1612..1619 1622..1624 1626..1629 1631..1634 1636..1643 1645..1646 1648 1650..1656 1658 1660..1663 1665..1666 1668 1670..1677 1679..1682 1684..1688 1690..1692 1695..1697 1700..1702 1705..1709 1711..1712 1714..1717 1719 1721..1722 1724..1735 1737 1739..1746 1748..1751 1754..1756 1758..1760 1762..1765 1767..1770 1772 1774..1775 1777..1780 1782..1790 1792..1795 1797..1800 1802 1804 1806..1810 1812..1817 1819..1835 1837 1839..1842 1844..1845 1847 1849..1854 1856..1865 1867..1872 1874..1875 1877 1879..1884 1886..1887 1889..1893 1896..1899 1901..1904 1906 1908..1909 1911..1915 1918..1920 1922..1925 1928..1930 1933..1935 1938..1940 1942..1945 1947..1950}",
  :recovered-frac 29/976,
  :unexpected-frac 0,
  :unexpected "#{}",
  :lost-frac 179/976,
  :ok-frac 1489/1952}}

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.

AeroNotix commented 10 years ago

Do you have your jepsen test code somewhere?

shikhar commented 10 years ago

@AeroNotix just pushed it here https://github.com/shikhar/jepsen/tree/eskka

you run it the same way as the ES tests since I updated elasticsearch_test.clj & elasticsearch.clj in place

shikhar commented 10 years ago

In the interest of disclosure: I've seen some recent failures, with a small number of writes lost, using the various partitioner strategies specified in nemesis.clj. Digging into them.

shikhar commented 10 years ago

As I mentioned here, I think the current failures are related to other bugs in ES that are hopefully going to be tackled in https://github.com/elasticsearch/elasticsearch/tree/feature/improve_zen.

Here are the latest results from jepsen runs for eskka-0.5.0-SNAPSHOT vs zen (both with ES 1.2.1).

self-primaries-nemesis
eskka - https://gist.github.com/728c3c3023a0b32c8eb6 - 0 writes lost
zen - https://gist.github.com/ef65fed8fbad72d892e8 - 1 write lost

partition-random-node
this nemesis was crashing for some reason: https://gist.github.com/shikhar/6ba745c6021c21d8571b

partition-halves
eskka - https://gist.github.com/d6c00eb30e471c0bc452 - 2 writes lost
zen - https://gist.github.com/159c6610957f3adea20f - 5 writes lost -- had to restart for the cluster to converge at end of test

partition-random-halves
eskka - https://gist.github.com/c0324584daf1ebb757ed - 3 writes lost
zen - https://gist.github.com/8dd773dd9c1e88aab957 - 8 writes lost -- had to restart for the cluster to converge at end of test

partition-nemesis-bridge
eskka - https://gist.github.com/911202ac0e6090b50d4a - 6 writes lost
zen - https://gist.github.com/1282e9c0033993775120 - 57 writes lost -- had to restart for the cluster to converge at end of test
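
The lost-write counts quoted above can be tallied side by side. This is only a rough sketch in Python: each nemesis was apparently run once per discovery implementation, so the totals are a loose comparison rather than a measurement. partition-random-node is left out because that nemesis crashed. The names `LOST_WRITES` and `totals` are illustrative, not part of jepsen.

```python
# Lost writes per nemesis, copied from the results listed above.
LOST_WRITES = {
    "self-primaries-nemesis": {"eskka": 0, "zen": 1},
    "partition-halves": {"eskka": 2, "zen": 5},
    "partition-random-halves": {"eskka": 3, "zen": 8},
    "partition-nemesis-bridge": {"eskka": 6, "zen": 57},
}


def totals(results):
    """Sum lost writes per discovery implementation across all nemeses."""
    summed = {}
    for per_impl in results.values():
        for impl, lost in per_impl.items():
            summed[impl] = summed.get(impl, 0) + lost
    return summed


if __name__ == "__main__":
    for nemesis, per_impl in LOST_WRITES.items():
        print(f"{nemesis:26} eskka={per_impl['eskka']:3} zen={per_impl['zen']:3}")
    print(totals(LOST_WRITES))  # {'eskka': 11, 'zen': 71}
```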

p.s. I've merged in @aphyr's latest changes in https://github.com/shikhar/jepsen/tree/eskka

AeroNotix commented 10 years ago

@shikhar FWIW you don't need to have a fork of Jepsen to do this. All of the stuff is available as a library as I have done in https://github.com/AeroNotix/jtpg (not sure if this is easier for you or not, just thought you might like to know)

AeroNotix commented 10 years ago

@shikhar your "As I mentioned here" link is 404, I think you meant to link to: http://aphyr.com/posts/317-call-me-maybe-elasticsearch

shikhar commented 10 years ago

Thanks @AeroNotix, fixed!

Re: using Jepsen as a library, that's very cool and makes more sense than a fork. I am going to port over to that pattern when I get the chance.

AeroNotix commented 10 years ago

@shikhar SGTM!