CAS

Source code

Unsafe.class declares the compareAndSwap native methods:

public final native boolean compareAndSwapObject(Object var1, long var2, Object var4, Object var5);

public final native boolean compareAndSwapInt(Object var1, long var2, int var4, int var5);

public final native boolean compareAndSwapLong(Object var1, long var2, long var4, long var6);
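
Application code normally reaches these native methods through the atomic classes rather than calling Unsafe directly. The following is a minimal sketch, assuming JDK 8, where `AtomicInteger.compareAndSet` delegates to `Unsafe.compareAndSwapInt`. It shows the contract only: the swap happens when the current value equals the expected value, and the boolean result reports whether it happened. The class name `CasContractDemo` is made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasContractDemo {
    // Returns {first CAS result, second CAS result, final value}.
    static int[] run() {
        AtomicInteger value = new AtomicInteger(5);
        boolean first = value.compareAndSet(5, 6);   // expected matches: value becomes 6
        boolean second = value.compareAndSet(5, 7);  // expected is stale: value stays 6
        return new int[] { first ? 1 : 0, second ? 1 : 0, value.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("first=" + r[0] + " second=" + r[1] + " value=" + r[2]);
    }
}
```

The second call fails because the value is already 6, which is the case a caller usually handles by re-reading and retrying.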

Implementation of the compareAndSwap native methods: unsafe.cpp L1607 & unsafe.cpp L1185

static JNINativeMethod methods_18[] = {
    {CC"compareAndSwapObject", CC"("OBJ"J"OBJ""OBJ")Z",  FN_PTR(Unsafe_CompareAndSwapObject)},
    {CC"compareAndSwapInt",  CC"("OBJ"J""I""I"")Z",      FN_PTR(Unsafe_CompareAndSwapInt)},
    {CC"compareAndSwapLong", CC"("OBJ"J""J""J"")Z",      FN_PTR(Unsafe_CompareAndSwapLong)},
};

UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

The call ends up in Atomic::cmpxchg(x, addr, e).
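
Note the shape of the return statement in Unsafe_CompareAndSwapInt: Atomic::cmpxchg hands back the value it found at addr, and the wrapper turns that into a boolean by comparing it with e. The same shape can be reproduced from Java. This is a sketch, assuming Java 9 or later, where `AtomicInteger.compareAndExchange` returns the witnessed value; the class `CmpxchgShapeDemo` and the helper `casViaExchange` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CmpxchgShapeDemo {
    // Mirrors `(jint)(Atomic::cmpxchg(x, addr, e)) == e`:
    // the exchange returns the old value, and success means old == expected.
    static boolean casViaExchange(AtomicInteger cell, int expected, int newValue) {
        int witness = cell.compareAndExchange(expected, newValue);
        return witness == expected;
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger(10);
        System.out.println(casViaExchange(cell, 10, 11)); // expected matches, swap happens
        System.out.println(casViaExchange(cell, 10, 12)); // expected is stale, no swap
        System.out.println(cell.get());
    }
}
```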

Implementation on linux_x86: atomic_linux_x86.inline.hpp L93

inline jint     Atomic::cmpxchg    (jint     exchange_value, volatile jint*     dest, jint     compare_value) {
  int mp = os::is_MP();
  __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)"
                    : "=a" (exchange_value)
                    : "r" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
                    : "cc", "memory");
  return exchange_value;
}

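Notes:

os::is_MP() reports whether the system has more than one processor (multi-core or multi-socket);
__asm__ marks what follows as inline assembly;
volatile tells the compiler not to optimize the assembly statement away;
LOCK_IF_MP is a macro (not an inline function). It compares mp with 0 and jumps over the lock prefix when mp is 0, so the prefix only takes effect on multiprocessor systems. Source: atomic_linux_x86.inline.hpp L147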
```
// Adding a lock prefix to an instruction on MP machine
#define LOCK_IF_MP(mp) "cmp $0, " #mp "; je 1f; lock; 1: "
```
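In the end the cmpxchgl instruction is executed.

Summary:

CAS is ultimately implemented with the lock cmpxchg instruction; on a single-processor system the lock prefix is skipped;
From the Java language's point of view CAS is lock-free; at the CPU level, however, the instruction may still take a lock.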
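
What "lock-free" means on the Java side can be sketched with a retry loop: no thread ever blocks on a monitor, and a failed CAS just means another thread got there first. The code below is an illustrative sketch, not the JDK's own implementation; `LockFreeCounterDemo`, `increment`, and `runThreads` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LockFreeCounterDemo {
    // Retry loop: read the current value, then try to publish current + 1.
    // A failed CAS means another thread won the race, so re-read and retry.
    static int increment(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Starts `threads` workers that each increment the shared counter `perThread` times.
    static int runThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    increment(counter);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // 40000: no increment is lost
    }
}
```

No `synchronized` block or `Lock` appears anywhere; the only point of coordination is the CAS itself, which is where the hardware-level locking described above takes place.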

Appendix 1

What the LOCK prefix does

Quoted from: https://www.felixcloutier.com/x86/lock#description

Description

Causes the processor’s LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.

In most IA-32 and all Intel 64 processors, locking may occur without the LOCK# signal being asserted. See the “IA-32 Architecture Compatibility” section below for more details.

The LOCK prefix can be prepended only to the following instructions and only to those forms of the instructions where the destination operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. If the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated. An undefined opcode exception will also be generated if the LOCK prefix is used with any instruction not in the above list. The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix.

The LOCK prefix is typically used with the BTS instruction to perform a read-modify-write operation on a memory location in shared memory environment.

The integrity of the LOCK prefix is not affected by the alignment of the memory field. Memory locking is observed for arbitrarily misaligned fields.

This instruction’s operation is the same in non-64-bit modes and 64-bit mode.


Appendix 2

CMPXCHG — Compare and Exchange

Quoted from: https://www.felixcloutier.com/x86/cmpxchg#description

Description

Compares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two values are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the destination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode.

This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processor’s bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)

In 64-bit mode, the instruction’s default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.
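
The comparison and the two possible outcomes described above can be written out as a small model. This is only a sketch of the instruction's semantics in Java. It is not atomic and it is not how the JVM implements CAS. The class `CmpxchgModel` and its field names are hypothetical; the zero flag behaviour comes from the instruction's Operation section rather than from the text quoted here.

```java
class CmpxchgModel {
    int accumulator;   // stands in for EAX
    int destination;   // stands in for the memory operand
    boolean zeroFlag;  // stands in for ZF

    // Models one execution of `cmpxchg destination, source`.
    void cmpxchg(int source) {
        if (accumulator == destination) {
            zeroFlag = true;
            destination = source;       // equal: source is stored into destination
        } else {
            zeroFlag = false;
            accumulator = destination;  // not equal: destination is loaded into the accumulator
        }
    }

    public static void main(String[] args) {
        CmpxchgModel m = new CmpxchgModel();
        m.accumulator = 1;
        m.destination = 1;
        m.cmpxchg(9);
        System.out.println(m.zeroFlag + " " + m.destination + " " + m.accumulator);
    }
}
```

In both branches the accumulator ends up holding the value that was in the destination before the instruction ran, which is why Atomic::cmpxchg can bind its result to the "=a" output and return it as the old value.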


