V8 Source Code: Memory Management #13


In this article we will look at how the V8 engine manages memory, from the perspective of its source code, covering memory allocation and garbage collection.

To keep the focus on the underlying ideas, this article uses a fairly old version of V8, 0.1.5. Its implementation is relatively simple, so the core ideas are easier to see.

Memory Allocation

V8 divides its memory into several regions: NewSpace, OldSpace, LargeObjectSpace, MapSpace and CodeSpace. The relationship between these spaces is shown in the figure below:

The role of each space:

LargeObjectSpace: stores large objects (objects too big to fit in a normal page) so that they never need to be copied; this includes Code, Sequential String and FixedArray objects;

MapSpace: stores the Map information (i.e. the hidden class) of objects, with a maximum size of 8MB; every Map object has a fixed size, and keeping them in a dedicated space makes them quick to locate;

NewSpace: stores objects of many types, with a maximum size of 2MB;

CodeSpace: stores precompiled code (?), with a maximum size of 512MB;

Old_Pointer_Space: stores pointer objects that survive a GC, with a maximum size of 512MB;

Old_Data_Space: stores data objects that survive a GC, with a maximum size of 512MB;

Initialization

Memory initialization comes first; it happens after V8 has initialized some OS parameters. The entry point is in src/heap.cc:

bool Heap::Setup(bool create_heap_objects) {
  // Initialize heap spaces and initial maps and objects. Whenever something
  // goes wrong, just return false. The caller should check the results and
  // call Heap::TearDown() to release allocated memory.
  //
  // If the heap is not yet configured (eg, through the API), configure it.
  // Configuration is based on the flags new-space-size (really the semispace
  // size) and old-space-size if set or the initial values of semispace_size_
  // and old_generation_size_ otherwise.
  if (!heap_configured) {
    if (!ConfigureHeap(FLAG_new_space_size, FLAG_old_space_size)) return false;
  }

  // Setup memory allocator and allocate an initial chunk of memory.  The
  // initial chunk is double the size of the new space to ensure that we can
  // find a pair of semispaces that are contiguous and aligned to their size.
  // Allocate the heap memory: new generation + old generation
  // set up chunks
  // MemoryAllocator is a singleton
  if (!MemoryAllocator::Setup(MaxCapacity())) return false;
  // Reserve 2 * young_generation_size_ of virtual memory; the address is kept in MemoryAllocator::initial_chunk_
  void* chunk
      = MemoryAllocator::ReserveInitialChunk(2 * young_generation_size_);
  if (chunk == NULL) return false;

  // Put the initial chunk of the old space at the start of the initial
  // chunk, then the two new space semispaces, then the initial chunk of
  // code space.  Align the pair of semispaces to their size, which must be
  // a power of 2.
  ASSERT(IsPowerOf2(young_generation_size_));
  Address old_space_start = reinterpret_cast<Address>(chunk);
  // laid out from the low-address end of the reserved region
  Address new_space_start = RoundUp(old_space_start, young_generation_size_);
  Address code_space_start = new_space_start + young_generation_size_;
  int old_space_size = new_space_start - old_space_start;
  int code_space_size = young_generation_size_ - old_space_size;

  // Initialize new space.
  new_space_ = new NewSpace(initial_semispace_size_, semispace_size_);
  if (new_space_ == NULL) return false;
  // mmap the memory for from_space and to_space
  if (!new_space_->Setup(new_space_start, young_generation_size_)) return false;

  // Initialize old space, set the maximum capacity to the old generation
  // size.
  // pagedSpace.setup
  old_space_ = new OldSpace(old_generation_size_, OLD_SPACE);
  if (old_space_ == NULL) return false;
  if (!old_space_->Setup(old_space_start, old_space_size)) return false;

  // Initialize the code space, set its maximum capacity to the old
  // generation size.
  code_space_ = new OldSpace(old_generation_size_, CODE_SPACE);
  if (code_space_ == NULL) return false;
  if (!code_space_->Setup(code_space_start, code_space_size)) return false;

  // Initialize map space.
  map_space_ = new MapSpace(kMaxMapSpaceSize);
  if (map_space_ == NULL) return false;
  // Setting up a paged space without giving it a virtual memory range big
  // enough to hold at least a page will cause it to allocate.
  if (!map_space_->Setup(NULL, 0)) return false;

  lo_space_ = new LargeObjectSpace();
  if (lo_space_ == NULL) return false;
  if (!lo_space_->Setup()) return false;

  if (create_heap_objects) {
    // Create initial maps.
    if (!CreateInitialMaps()) return false;
    if (!CreateApiObjects()) return false;

    // Create initial objects
    if (!CreateInitialObjects()) return false;
  }

  LOG(IntEvent("heap-capacity", Capacity()));
  LOG(IntEvent("heap-available", Available()));

  return true;
}

This does the following:

1. Configure the heap parameters, including young_generation_size_ (2MB) and old_generation_size_ (512MB); the old generation is managed in pages of 8KB each, and old_generation_size_ caps how much memory it may use;

2. MemoryAllocator::Setup initializes the chunk bookkeeping used to manage pages; a chunk holds up to 64 pages;

3. Reserve 2 * young_generation_size_ of virtual memory, keeping the address in MemoryAllocator::initial_chunk_. Note that MemoryAllocator is a singleton;

4. Carve the individual memory regions out of the virtual memory reserved in the previous step.

The virtual memory is split into new_space_, old_space_, code_space_, map_space_ and lo_space_, laid out as shown in the figure below:
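
To make this layout concrete, here is a small stand-alone sketch (not V8 code; the base address and the RoundUpTo helper are hypothetical) of the same alignment arithmetic that Heap::Setup performs, assuming a 2MB young generation:

#include <cstdint>
#include <cstdio>

// Round addr up to the next multiple of alignment (alignment must be a power of 2).
static uintptr_t RoundUpTo(uintptr_t addr, uintptr_t alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

int main() {
  const uintptr_t young_generation_size = 2 * 1024 * 1024;  // 2MB, a power of 2
  const uintptr_t chunk_start = 0x08100000;                 // hypothetical reserved base

  // Same layout as Heap::Setup: old space first, then the size-aligned pair of
  // semispaces (new space), then the initial chunk of code space.
  uintptr_t old_space_start  = chunk_start;
  uintptr_t new_space_start  = RoundUpTo(old_space_start, young_generation_size);
  uintptr_t code_space_start = new_space_start + young_generation_size;
  uintptr_t old_space_size   = new_space_start - old_space_start;   // 1MB in this example
  uintptr_t code_space_size  = young_generation_size - old_space_size;

  printf("old  space: %#lx, %lu bytes\n",
         (unsigned long)old_space_start, (unsigned long)old_space_size);
  printf("new  space: %#lx, %lu bytes\n",
         (unsigned long)new_space_start, (unsigned long)young_generation_size);
  printf("code space: %#lx, %lu bytes\n",
         (unsigned long)code_space_start, (unsigned long)code_space_size);
  return 0;
}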

Let's now look in more detail at how each region is initialized; the code lives in src/spaces.cc.

NewSpace

bool NewSpace::Setup(Address start, int size) {
  ASSERT(size == 2 * maximum_capacity_);
  ASSERT(IsAddressAligned(start, size, 0));

  if (to_space_ == NULL
      || !to_space_->Setup(start, maximum_capacity_)) {
    return false;
  }
  if (from_space_ == NULL
      || !from_space_->Setup(start + maximum_capacity_, maximum_capacity_)) {
    return false;
  }

  start_ = start;
  address_mask_ = ~(size - 1);
  object_mask_ = address_mask_ | kHeapObjectTag;
  object_expected_ = reinterpret_cast<uint32_t>(start) | kHeapObjectTag;

  allocation_info_.top = to_space_->low();
  allocation_info_.limit = to_space_->high();
  mc_forwarding_info_.top = NULL;
  mc_forwarding_info_.limit = NULL;

  ASSERT_SEMISPACE_ALLOCATION_INFO(allocation_info_, to_space_);
  return true;
}

This mainly sets up to_space_ and from_space_, both of type SemiSpace, whose initialization looks like this:

bool SemiSpace::Setup(Address start, int size) {
  ASSERT(size == maximum_capacity_);
  if (!MemoryAllocator::CommitBlock(start, capacity_)) return false;

  start_ = start;
  address_mask_ = ~(size - 1);
  object_mask_ = address_mask_ | kHeapObjectTag;
  object_expected_ = reinterpret_cast<uint32_t>(start) | kHeapObjectTag;

  age_mark_ = start_;
  return true;
}

bool MemoryAllocator::CommitBlock(Address start, size_t size) {
  ASSERT(start != NULL);
  ASSERT(size > 0);
  ASSERT(initial_chunk_ != NULL);
  ASSERT(initial_chunk_->address() <= start);
  ASSERT(start + size <= reinterpret_cast<Address>(initial_chunk_->address())
                             + initial_chunk_->size());

  // mmap
  if (!initial_chunk_->Commit(start, size)) return false;
  Counters::memory_allocated.Increment(size);
  return true;
}

The real work is done by MemoryAllocator::CommitBlock, which commits part of the reserved virtual memory; initial_chunk_->Commit ends up calling VirtualMemory::Commit:

bool VirtualMemory::Commit(void* address, size_t size) {
  if (MAP_FAILED == mmap(address, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                         kMmapFd, kMmapFdOffset)) {
    return false;
  }

  UpdateAllocatedSpaceLimits(address, size);
  return true;
}

So essentially this just maps a block of virtual memory with mmap. For the relationship between mmap and malloc, see the article "Linux内存分配小结--malloc、brk、mmap" (a summary of Linux memory allocation: malloc, brk and mmap).

OldSpace

OldSpace inherits from PagedSpace, so old_space_->Setup actually calls the base-class method PagedSpace::Setup:

bool PagedSpace::Setup(Address start, size_t size) {
  if (HasBeenSetup()) return false;

  int num_pages = 0;
  // Try to use the virtual memory range passed to us.  If it is too small to
  // contain at least one page, ignore it and allocate instead.
  // if the range lies inside the reserved virtual memory
  if (PagesInChunk(start, size) > 0) {
    first_page_ = MemoryAllocator::CommitPages(start, size, this, &num_pages);
  } else {
    // how many pages to request
    int requested_pages = Min(MemoryAllocator::kPagesPerChunk,
                              max_capacity_ / Page::kObjectAreaSize);
    first_page_ =
        MemoryAllocator::AllocatePages(requested_pages, &num_pages, this);
    if (!first_page_->is_valid()) return false;
  }

  // We are sure that the first page is valid and that we have at least one
  // page.
  ASSERT(first_page_->is_valid());
  ASSERT(num_pages > 0);
  accounting_stats_.ExpandSpace(num_pages * Page::kObjectAreaSize);
  ASSERT(Capacity() <= max_capacity_);

  for (Page* p = first_page_; p->is_valid(); p = p->next_page()) {
    // clear the remembered set, used later by the mark-compact collector
    p->ClearRSet();
  }

  // Use first_page_ for allocation.
  SetAllocationInfo(&allocation_info_, first_page_);

  return true;
}

This does the following:

1. Check whether the reserved virtual memory can hold the space
    a. If it can, call MemoryAllocator::CommitPages and allocate directly from the reserved virtual memory
    b. If it cannot, call MemoryAllocator::AllocatePages to request new virtual memory
2. Walk all pages and clear their remembered sets, which are used later by the garbage collector

Next we look at MemoryAllocator::CommitPages and MemoryAllocator::AllocatePages.

MemoryAllocator::CommitPages is used when the reserved virtual memory can hold OLD_SPACE:

Page* MemoryAllocator::CommitPages(Address start, size_t size,
                                   PagedSpace* owner, int* num_pages) {
  ASSERT(start != NULL);
  *num_pages = PagesInChunk(start, size);
  ASSERT(*num_pages > 0);
  ASSERT(initial_chunk_ != NULL);
  ASSERT(initial_chunk_->address() <= start);
  ASSERT(start + size <= reinterpret_cast<Address>(initial_chunk_->address())
                             + initial_chunk_->size());

  if (!initial_chunk_->Commit(start, size)) {
    return Page::FromAddress(NULL);
  }
  Counters::memory_allocated.Increment(size);

  // So long as we correctly overestimated the number of chunks we should not
  // run out of chunk ids.
  CHECK(!OutOfChunkIds());
  int chunk_id = Pop();
  chunks_[chunk_id].init(start, size, owner);
  return InitializePagesInChunk(chunk_id, *num_pages, owner);
}

This does two things:

1. Commit part of the reserved virtual memory via initial_chunk_->Commit, i.e. the VirtualMemory::Commit introduced earlier
2. Initialize chunks_. Note that chunks_ is an array of ChunkInfo entries that stores the information of every chunk.

MemoryAllocator::AllocatePages is used when the reserved virtual memory is not big enough to hold OLD_SPACE:

Page* MemoryAllocator::AllocatePages(int requested_pages, int* allocated_pages,
                                     PagedSpace* owner) {
  if (requested_pages <= 0) return Page::FromAddress(NULL);
  size_t chunk_size = requested_pages * Page::kPageSize;

  // There is not enough space to guarantee the desired number pages can be
  // allocated.
  // not enough space left, so request as much as is still available
  if (size_ + static_cast<int>(chunk_size) > capacity_) {
    // Request as many pages as we can.
    chunk_size = capacity_ - size_;
    requested_pages = chunk_size >> Page::kPageSizeBits;

    if (requested_pages <= 0) return Page::FromAddress(NULL);
  }

  void* chunk = AllocateRawMemory(chunk_size, &chunk_size);
  if (chunk == NULL) return Page::FromAddress(NULL);
  LOG(NewEvent("PagedChunk", chunk, chunk_size));

  *allocated_pages = PagesInChunk(static_cast<Address>(chunk), chunk_size);
  // if there is not even one full page, the allocation is useless; munmap the memory
  if (*allocated_pages == 0) {
    FreeRawMemory(chunk, chunk_size);
    LOG(DeleteEvent("PagedChunk", chunk));
    return Page::FromAddress(NULL);
  }

  // initialize the new chunk
  int chunk_id = Pop();
  chunks_[chunk_id].init(static_cast<Address>(chunk), chunk_size, owner);

  return InitializePagesInChunk(chunk_id, *allocated_pages, owner);
}

This does the following:

1. If the request would exceed the maximum capacity_, compute how much can still be allocated and use that as the request size
2. Allocate the memory via MemoryAllocator::AllocateRawMemory
3. Check whether the allocated memory holds at least one page (8KB); if not, the allocation is useless and the virtual memory is released with munmap
4. Initialize the new chunk, the same step as in MemoryAllocator::CommitPages

Now let's look at MemoryAllocator::AllocateRawMemory:

void* MemoryAllocator::AllocateRawMemory(const size_t requested,
                                         size_t* allocated) {
  if (size_ + static_cast<int>(requested) > capacity_) return NULL;

  // mmap & UpdateAllocatedSpaceLimits
  void* mem = OS::Allocate(requested, allocated);
  int alloced = *allocated;
  size_ += alloced;
  Counters::memory_allocated.Increment(alloced);
  return mem;
}

The OS::Allocate it calls looks like this:

void* OS::Allocate(const size_t requested, size_t* allocated) {
  const size_t msize = RoundUp(requested, getpagesize());
  void* mbase = mmap(NULL, msize, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mbase == MAP_FAILED) {
    LOG(StringEvent("OS::Allocate", "mmap failed"));
    return NULL;
  }
  *allocated = msize;
  UpdateAllocatedSpaceLimits(mbase, msize);
  return mbase;
}

This also uses mmap, but the first argument, the requested start address of the mapping, is NULL; this is the difference from the earlier calls that commit parts of the reserved virtual memory at a fixed address.

One more remark: using the reserved virtual memory helps read/write efficiency, mainly because after the range has been reserved, later commits do not need to modify the mapping between physical and logical addresses, that is, the process page table.
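
As a rough illustration of this reserve-then-commit pattern (a minimal sketch, not V8's VirtualMemory class), an address range can first be reserved with PROT_NONE and parts of it committed later with a MAP_FIXED mapping:

#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

int main() {
  const size_t kReserved = 4 * 1024 * 1024;   // reserve 4MB of address space
  // Reserve: claim the address range without any access rights yet.
  void* base = mmap(nullptr, kReserved, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  if (base == MAP_FAILED) { perror("reserve"); return 1; }

  // Commit: map the first 1MB of the reserved range read/write, at a fixed address.
  const size_t kCommitted = 1 * 1024 * 1024;
  void* committed = mmap(base, kCommitted, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  if (committed == MAP_FAILED) { perror("commit"); return 1; }

  static_cast<char*>(committed)[0] = 42;      // the committed memory is now usable
  munmap(base, kReserved);
  return 0;
}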

Other spaces

The other spaces are similar to OldSpace: they are all subclasses of PagedSpace, so we will not go through them again.

Allocation

Before looking at allocation in detail, a few concepts:

1. Page: in a PagedSpace, memory is managed in units of pages, and an object may not span pages (the page size matches the OS page size);
2. Chunk: the heap is extended in units of chunks, and a chunk contains at most 64 pages; growing by whole chunks reduces the number of mmap system calls, which helps efficiency;
3. FreeList: the free list collects the fragments left inside each page, much like the free lists in an operating system's memory manager (a conceptual sketch follows below).
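
As a conceptual sketch of point 3 (much simpler than V8's actual free list, which chains free blocks of various sizes inside the pages), a free list can be pictured as a size-indexed pool of recycled blocks:

#include <cstddef>
#include <map>
#include <vector>

// A toy free list: free blocks are grouped by size. Allocate() hands back a
// block of at least `size` bytes and reports how many bytes of it are wasted.
class ToyFreeList {
 public:
  void Free(void* address, size_t size) { blocks_[size].push_back(address); }

  void* Allocate(size_t size, size_t* wasted_bytes) {
    // First fit by size: find the smallest bucket that is big enough.
    for (auto it = blocks_.lower_bound(size); it != blocks_.end(); ++it) {
      if (it->second.empty()) continue;
      void* block = it->second.back();
      it->second.pop_back();
      *wasted_bytes = it->first - size;   // leftover space inside the block
      return block;
    }
    return nullptr;                       // nothing suitable: expand the space or GC
  }

 private:
  std::map<size_t, std::vector<void*>> blocks_;
};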

The entry point for allocation is Heap::AllocateRaw in src/heap-inl.h:

// entry point for memory allocation
Object* Heap::AllocateRaw(int size_in_bytes, AllocationSpace space) {
  ASSERT(allocation_allowed_ && gc_state_ == NOT_IN_GC);
#ifdef DEBUG
  if (FLAG_gc_interval >= 0 &&
      !disallow_allocation_failure_ &&
      Heap::allocation_timeout_-- <= 0) {
    return Failure::RetryAfterGC(size_in_bytes, space);
  }
  Counters::objs_since_last_full.Increment();
  Counters::objs_since_last_young.Increment();
#endif
  if (NEW_SPACE == space) {
    return new_space_->AllocateRaw(size_in_bytes);
  }

  Object* result;
  if (OLD_SPACE == space) {
    result = old_space_->AllocateRaw(size_in_bytes);
  } else if (CODE_SPACE == space) {
    result = code_space_->AllocateRaw(size_in_bytes);
  } else if (LO_SPACE == space) {
    result = lo_space_->AllocateRaw(size_in_bytes);
  } else {
    ASSERT(MAP_SPACE == space);
    result = map_space_->AllocateRaw(size_in_bytes);
  }
  if (result->IsFailure()) old_gen_exhausted_ = true;
  return result;
}

This simply dispatches to the AllocateRaw method of the requested space.

OldSpace

First, recall that OldSpace inherits from PagedSpace, so its memory management is page based.

AllocateRaw looks like this:

// Allocates requested bytes. May return Failure if the space is full.
Object* AllocateRaw(int size_in_bytes) {
  ASSERT_OBJECT_SIZE(size_in_bytes);
  return AllocateRawInternal(size_in_bytes, &allocation_info_);
}

AllocateRawInternal looks like this:

Object* OldSpace::AllocateRawInternal(int size_in_bytes,
                                      AllocationInfo* alloc_info) {
  ASSERT(HasBeenSetup());

  if (allocation_mode_ == LINEAR_ONLY || allocation_mode_ == LINEAR) {
    // Try linear allocation in the current page.
    Address cur_top = alloc_info->top;
    Address new_top = cur_top + size_in_bytes;
    if (new_top <= alloc_info->limit) {
      Object* obj = HeapObject::FromAddress(cur_top);
      alloc_info->top = new_top;
      ASSERT_PAGED_ALLOCATION_INFO(*alloc_info);

      accounting_stats_.AllocateBytes(size_in_bytes);
      ASSERT(Size() <= Capacity());
      return obj;
    }
  } else {
    // For now we should not try free list allocation during m-c relocation.
    // allocate from the free list
    ASSERT(alloc_info == &allocation_info_);
    int wasted_bytes;
    Object* object = free_list_.Allocate(size_in_bytes, &wasted_bytes);
    accounting_stats_.WasteBytes(wasted_bytes);
    if (!object->IsFailure()) {
      accounting_stats_.AllocateBytes(size_in_bytes);
      return object;
    }
  }
  // Fast allocation failed.
  return SlowAllocateRaw(size_in_bytes, alloc_info);
}

This does the following:

1. Check whether the allocation mode is linear
    a. If it is, check whether the current page has enough room left
        i. If there is enough room, carve out the region by bumping the page's top pointer forward
        ii. If the current page does not have enough room, fall through to step 2 (SlowAllocateRaw)
    b. If it is not, allocate from free_list_; if that fails, fall through to step 2
2. The fast path in step 1 failed, so run SlowAllocateRaw

SlowAllocateRaw looks like this:

// Slow cases for AllocateRawInternal.  In linear allocation mode, try
// to allocate in the next page in the space.  If there are no more
// pages, switch to free-list allocation if permitted, otherwise try
// to grow the space.  In free-list allocation mode, try to grow the
// space and switch to linear allocation.
Object* OldSpace::SlowAllocateRaw(int size_in_bytes,
                                  AllocationInfo* alloc_info) {
  if (allocation_mode_ == LINEAR_ONLY || allocation_mode_ == LINEAR) {
    // the last allocated page; memory is used from low addresses toward high addresses
    Page* top_page = TopPageOf(*alloc_info);
    // Until we implement free-list allocation during global gc, we have two
    // cases: one for normal allocation and one for m-c relocation allocation.
    // first_page
    if (alloc_info == &allocation_info_) {  // Normal allocation.
      // how much room is left in the last page
      int free_size = top_page->ObjectAreaEnd() - alloc_info->top;
      // Add the extra space at the top of this page to the free list.
      // then simply move top forward
      if (free_size > 0) {
        int wasted_bytes = free_list_.Free(alloc_info->top, free_size);
        accounting_stats_.WasteBytes(wasted_bytes);
        alloc_info->top += free_size;
        ASSERT_PAGED_ALLOCATION_INFO(*alloc_info);
      }

      // Move to the next page in this space if there is one; switch
      // to free-list allocation, if we can; try to expand the space otherwise
      // move to the next page
      if (top_page->next_page()->is_valid()) {
        SetAllocationInfo(alloc_info, top_page->next_page());
      }
      // switch allocation_mode_ to FREE_LIST and allocate from the free list
      else if (allocation_mode_ == LINEAR) {
        allocation_mode_ = FREE_LIST;
      }
      // expand a chunk
      else if (Expand(top_page)) {
        ASSERT(top_page->next_page()->is_valid());
        SetAllocationInfo(alloc_info, top_page->next_page());
      }
      // collect garbage and retry
      else {
        return Failure::RetryAfterGC(size_in_bytes, identity());
      }
    } else {  // Allocation during m-c relocation.
      // During m-c 'allocation' while computing forwarding addresses, we do
      // not yet add blocks to the free list because they still contain live
      // objects.  We also cache the m-c forwarding allocation pointer in the
      // current page.

      // If there are no more pages try to expand the space.  This can only
      // happen when promoting objects from the new space.
      if (!top_page->next_page()->is_valid()) {
        if (!Expand(top_page)) {
          return Failure::RetryAfterGC(size_in_bytes, identity());
        }
      }

      // Move to the next page.
      ASSERT(top_page->next_page()->is_valid());
      top_page->mc_relocation_top = alloc_info->top;
      SetAllocationInfo(alloc_info, top_page->next_page());
    }
  } else {  // Free-list allocation.
    // We failed to allocate from the free list; try to expand the space and
    // switch back to linear allocation.
    ASSERT(alloc_info == &allocation_info_);
    Page* top_page = TopPageOf(*alloc_info);
    if (!top_page->next_page()->is_valid()) {
      if (!Expand(top_page)) {
        return Failure::RetryAfterGC(size_in_bytes, identity());
      }
    }

    // We surely have more pages, move to the next page and switch to linear
    // allocation.
    ASSERT(top_page->next_page()->is_valid());
    SetAllocationInfo(alloc_info, top_page->next_page());
    ASSERT(allocation_mode_ == FREE_LIST);
    allocation_mode_ = LINEAR;
  }

  // Perform the allocation.
  return AllocateRawInternal(size_in_bytes, alloc_info);
}

This does the following:

1. Use TopPageOf to get top_page, the last page that has been allocated from
2. Compute how much room is left in that last page
3. If the remaining size is greater than 0, hand the leftover space to free_list_ and move page->top to the end of the page
4. Allocate the memory; there are four possibilities
    a. If top_page->next_page() is valid, i.e. the page after the current top_page exists, allocate from it directly and go to step 5 (AllocateRawInternal)
    b. If the allocation mode is linear, switch it to FREE_LIST and go to step 5, so that subsequent allocations come from free_list_
    c. Allocate a new chunk (within the maximum space limit; at most 64 pages per chunk) and go to step 5
    d. If none of the above succeeded, return RetryAfterGC: collect garbage and retry
5. AllocateRawInternal

Let's now focus on Expand, which actually calls the base-class method PagedSpace::Expand:

bool PagedSpace::Expand(Page* last_page) {
  ASSERT(max_capacity_ % Page::kObjectAreaSize == 0);
  ASSERT(Capacity() % Page::kObjectAreaSize == 0);

  if (Capacity() == max_capacity_) return false;

  ASSERT(Capacity() < max_capacity_);
  // Last page must be valid and its next page is invalid.
  ASSERT(last_page->is_valid() && !last_page->next_page()->is_valid());

  // number of pages still available
  int available_pages = (max_capacity_ - Capacity()) / Page::kObjectAreaSize;
  if (available_pages <= 0) return false;

  // at most one chunk at a time
  int desired_pages = Min(available_pages, MemoryAllocator::kPagesPerChunk);
  Page* p = MemoryAllocator::AllocatePages(desired_pages, &desired_pages, this);
  if (!p->is_valid()) return false;

  accounting_stats_.ExpandSpace(desired_pages * Page::kObjectAreaSize);
  ASSERT(Capacity() <= max_capacity_);

  MemoryAllocator::SetNextPage(last_page, p);

  // Clear remembered set of new pages.
  while (p->is_valid()) {
    p->ClearRSet();
    p = p->next_page();
  }

  return true;
}

This does the following:

1. Compute available_pages, the maximum number of pages that can still be allocated
2. Take the minimum of available_pages and kPagesPerChunk (64) and call MemoryAllocator::AllocatePages, described above, which allocates a chunk and initializes its pages
3. Stitch the page list together, i.e. hook the newly allocated pages onto top_page

The flow of OldSpace allocation can be summarized by the following diagram:

NewSpace

NewSpace::AllocateRawInternal looks like this:

Object* NewSpace::AllocateRawInternal(int size_in_bytes,
                                      AllocationInfo* alloc_info) {
  Address new_top = alloc_info->top + size_in_bytes;
  if (new_top > alloc_info->limit) {
    return Failure::RetryAfterGC(size_in_bytes, NEW_SPACE);
  }

  Object* obj = HeapObject::FromAddress(alloc_info->top);
  alloc_info->top = new_top;
#ifdef DEBUG
  SemiSpace* space =
      (alloc_info == &allocation_info_) ? to_space_ : from_space_;
  ASSERT(space->low() <= alloc_info->top
         && alloc_info->top <= space->high()
         && alloc_info->limit == space->high());
#endif
  return obj;
}

There is no notion of pages here: allocation simply bumps the top pointer, and if there is not enough room the call returns RetryAfterGC.
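
The fast path here is plain bump-pointer allocation. A minimal stand-alone sketch of the same idea (the AllocationInfo type below is illustrative, not V8's):

#include <cstddef>
#include <cstdint>

struct AllocationInfo {
  uint8_t* top;     // next free byte
  uint8_t* limit;   // end of the usable region
};

// Bump-pointer allocation: advance top by size_in_bytes, or fail when the
// region is exhausted (the caller would then trigger a scavenge and retry).
inline void* AllocateRaw(AllocationInfo* info, size_t size_in_bytes) {
  uint8_t* new_top = info->top + size_in_bytes;
  if (new_top > info->limit) return nullptr;   // corresponds to RetryAfterGC
  void* result = info->top;
  info->top = new_top;
  return result;
}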

LargeObjectSpace

LargeObjectSpace::AllocateRawInternal looks like this:

Object* LargeObjectSpace::AllocateRawInternal(int requested_size,
                                              int object_size) {
  ASSERT(0 < object_size && object_size <= requested_size);
  size_t chunk_size;
  LargeObjectChunk* chunk =
      LargeObjectChunk::New(requested_size, &chunk_size);
  if (chunk == NULL) {
    return Failure::RetryAfterGC(requested_size, LO_SPACE);
  }

  size_ += chunk_size;
  page_count_++;
  chunk->set_next(first_chunk_);
  chunk->set_size(chunk_size);
  first_chunk_ = chunk;

  // Set the object address and size in the page header and clear its
  // remembered set.
  Page* page = Page::FromAddress(RoundUp(chunk->address(), Page::kPageSize));
  Address object_address = page->ObjectAreaStart();
  // Clear the low order bit of the second word in the page to flag it as a
  // large object page.  If the chunk_size happened to be written there, its
  // low order bit should already be clear.
  ASSERT((chunk_size & 0x1) == 0);
  page->is_normal_page &= ~0x1;
  page->ClearRSet();
  int extra_bytes = requested_size - object_size;
  if (extra_bytes > 0) {
    // The extra memory for the remembered set should be cleared.
    memset(object_address + object_size, 0, extra_bytes);
  }

  return HeapObject::FromAddress(object_address);
}

LargeObjectChunk* LargeObjectChunk::New(int size_in_bytes,
                                        size_t* chunk_size) {
  size_t requested = ChunkSizeFor(size_in_bytes);
  void* mem = MemoryAllocator::AllocateRawMemory(requested, chunk_size);
  if (mem == NULL) return NULL;
  LOG(NewEvent("LargeObjectChunk", mem, *chunk_size));
  if (*chunk_size < requested) {
    MemoryAllocator::FreeRawMemory(mem, *chunk_size);
    LOG(DeleteEvent("LargeObjectChunk", mem));
    return NULL;
  }
  return reinterpret_cast<LargeObjectChunk*>(mem);
}

Because each page in this space holds exactly one object, an allocation simply grabs an object-sized block of memory via MemoryAllocator::AllocateRawMemory and links it into the space's chunk list.
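
Since every large object gets a chunk of its own, the space is little more than a linked list of chunks. A stripped-down sketch of that structure (using malloc in place of MemoryAllocator::AllocateRawMemory, and ignoring page headers and remembered sets):

#include <cstddef>
#include <cstdlib>

// Toy large-object space: each allocation gets its own chunk, and the chunks
// are kept on a singly linked list so they can be swept and freed individually.
struct LargeChunk {
  LargeChunk* next;
  size_t size;
  // the object payload follows this header in the same allocation
};

struct ToyLargeObjectSpace {
  LargeChunk* first_chunk = nullptr;

  void* AllocateRaw(size_t object_size) {
    size_t chunk_size = sizeof(LargeChunk) + object_size;
    LargeChunk* chunk = static_cast<LargeChunk*>(std::malloc(chunk_size));
    if (chunk == nullptr) return nullptr;        // the caller would RetryAfterGC
    chunk->next = first_chunk;                   // link into the chunk list
    chunk->size = chunk_size;
    first_chunk = chunk;
    return chunk + 1;                            // address right after the header
  }
};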

Heap Teardown

When a V8 instance is destroyed, V8::TearDown is called, which in turn calls Heap::TearDown:

void Heap::TearDown() {
  GlobalHandles::TearDown();

  if (new_space_ != NULL) {
    new_space_->TearDown();
    delete new_space_;
    new_space_ = NULL;
  }

  if (old_space_ != NULL) {
    old_space_->TearDown();
    delete old_space_;
    old_space_ = NULL;
  }

  if (code_space_ != NULL) {
    code_space_->TearDown();
    delete code_space_;
    code_space_ = NULL;
  }

  if (map_space_ != NULL) {
    map_space_->TearDown();
    delete map_space_;
    map_space_ = NULL;
  }

  if (lo_space_ != NULL) {
    lo_space_->TearDown();
    delete lo_space_;
    lo_space_ = NULL;
  }

  MemoryAllocator::TearDown();
}

In essence it just frees each space.

Garbage Collection

When an object allocation fails, Failure::RetryAfterGC is returned and memory cleanup begins. The garbage-collection entry point is in src/heap-inl.h:

// Do not use the identifier __object__ in a call to this macro.
//
// Call the function FUNCTION_CALL.  If it fails with a RetryAfterGC
// failure, call the garbage collector and retry the function.  If the
// garbage collector cannot reclaim the required space or the second
// call fails with a RetryAfterGC failure, fail with out of memory.
// If there is any other failure, return a null handle.  If either
// call succeeds, return a handle to the functions return value.
//
// Note that this macro always returns or raises a fatal error.
#define CALL_HEAP_FUNCTION(FUNCTION_CALL, TYPE)                              \
  do {                                                                       \
    GC_GREEDY_CHECK();                                                       \
    Object* __object__ = FUNCTION_CALL;                                      \
    if (__object__->IsFailure()) {                                           \
      if (__object__->IsRetryAfterGC()) {                                    \
        if (!Heap::CollectGarbage(                                           \
                Failure::cast(__object__)->requested(),                      \
                Failure::cast(__object__)->allocation_space())) {            \
          /* TODO(1181417): Fix this. */                                     \
          v8::internal::V8::FatalProcessOutOfMemory("CALL_HEAP_FUNCTION");   \
        }                                                                    \
        __object__ = FUNCTION_CALL;                                          \
        if (__object__->IsFailure()) {                                       \
          if (__object__->IsRetryAfterGC()) {                                \
            /* TODO(1181417): Fix this. */                                   \
            v8::internal::V8::FatalProcessOutOfMemory("CALL_HEAP_FUNCTION"); \
          }                                                                  \
          return Handle<TYPE>();                                             \
        }                                                                    \
      } else {                                                               \
        return Handle<TYPE>();                                               \
      }                                                                      \
    }                                                                        \
    return Handle<TYPE>(TYPE::cast(__object__));                             \
  } while (false)

This invokes Heap::CollectGarbage:

bool Heap::CollectGarbage(int requested_size, AllocationSpace space) {
  // The VM is in the GC state until exiting this function.
  VMState state(GC);

#ifdef DEBUG
  // Reset the allocation timeout to the GC interval, but make sure to
  // allow at least a few allocations after a collection. The reason
  // for this is that we have a lot of allocation sequences and we
  // assume that a garbage collection will allow the subsequent
  // allocation attempts to go through.
  allocation_timeout_ = Max(6, FLAG_gc_interval);
#endif

  { GCTracer tracer;
    GarbageCollectionPrologue();

    GarbageCollector collector = SelectGarbageCollector(space);
    tracer.set_collector(collector);

    StatsRate* rate = (collector == SCAVENGER)
        ? &Counters::gc_scavenger
        : &Counters::gc_compactor;
    rate->Start();
    PerformGarbageCollection(space, collector);
    rate->Stop();

    GarbageCollectionEpilogue();
  }

#ifdef ENABLE_LOGGING_AND_PROFILING
  if (FLAG_log_gc) HeapProfiler::WriteSample();
#endif

  switch (space) {
    case NEW_SPACE:
      return new_space_->Available() >= requested_size;
    case OLD_SPACE:
      return old_space_->Available() >= requested_size;
    case CODE_SPACE:
      return code_space_->Available() >= requested_size;
    case MAP_SPACE:
      return map_space_->Available() >= requested_size;
    case LO_SPACE:
      return lo_space_->Available() >= requested_size;
  }
  return false;
}

This does two main things:

1. SelectGarbageCollector: choose a garbage collector
2. PerformGarbageCollection: run the garbage collection

The collector-selection code in Heap::SelectGarbageCollector looks like this:

GarbageCollector Heap::SelectGarbageCollector(AllocationSpace space) {
  // Is global GC requested?
  if (space != NEW_SPACE || FLAG_gc_global) {
    Counters::gc_compactor_caused_by_request.Increment();
    return MARK_COMPACTOR;
  }

  // Is enough data promoted to justify a global GC?
  if (PromotedSpaceSize() > promoted_space_limit_) {
    Counters::gc_compactor_caused_by_promoted_data.Increment();
    return MARK_COMPACTOR;
  }

  // Have allocation in OLD and LO failed?
  if (old_gen_exhausted_) {
    Counters::gc_compactor_caused_by_oldspace_exhaustion.Increment();
    return MARK_COMPACTOR;
  }

  // Is there enough space left in OLD to guarantee that a scavenge can
  // succeed?
  //
  // Note that old_space_->MaxAvailable() undercounts the memory available
  // for object promotion. It counts only the bytes that the memory
  // allocator has not yet allocated from the OS and assigned to any space,
  // and does not count available bytes already in the old space or code
  // space.  Undercounting is safe---we may get an unrequested full GC when
  // a scavenge would have succeeded.
  if (old_space_->MaxAvailable() <= new_space_->Size()) {
    Counters::gc_compactor_caused_by_oldspace_exhaustion.Increment();
    return MARK_COMPACTOR;
  }

  // Default
  return SCAVENGER;
}

There are two garbage-collection algorithms, MarkCompact and Scavenge. Four situations make SelectGarbageCollector return the MARK_COMPACTOR collector; in every other case it returns SCAVENGER. The four situations are:

1. The space is not NEW_SPACE, or a global GC was requested
2. The size of the promoted space (PromotedSpaceSize) exceeds promoted_space_limit_
3. A previous allocation in old_space_ or lo_space_ failed
4. The space left in old_space_ is smaller than the size of new_space_

Heap::PerformGarbageCollection runs the collection:

void Heap::PerformGarbageCollection(AllocationSpace space,
                                    GarbageCollector collector) {
  if (collector == MARK_COMPACTOR && global_gc_prologue_callback_) {
    ASSERT(!allocation_allowed_);
    global_gc_prologue_callback_();
  }

  if (collector == MARK_COMPACTOR) {
    MarkCompact();

    int promoted_space_size = PromotedSpaceSize();
    promoted_space_limit_ =
        promoted_space_size + Max(2 * MB, (promoted_space_size/100) * 35);
    old_gen_exhausted_ = false;

    // If we have used the mark-compact collector to collect the new
    // space, and it has not compacted the new space, we force a
    // separate scavenge collection.  THIS IS A HACK.  It covers the
    // case where (1) a new space collection was requested, (2) the
    // collector selection policy selected the mark-compact collector,
    // and (3) the mark-compact collector policy selected not to
    // compact the new space.  In that case, there is no more (usable)
    // free space in the new space after the collection compared to
    // before.
    if (space == NEW_SPACE && !MarkCompactCollector::HasCompacted()) {
      Scavenge();
    }
  } else {
    Scavenge();
  }
  Counters::objs_since_last_young.Set(0);

  // Process weak handles post gc.
  GlobalHandles::PostGarbageCollectionProcessing();

  if (collector == MARK_COMPACTOR && global_gc_epilogue_callback_) {
    ASSERT(!allocation_allowed_);
    global_gc_epilogue_callback_();
  }
}

Based on the selected collector, this runs the corresponding garbage-collection algorithm (MarkCompact or Scavenge) over the spaces.

Scavenge

Let's now look at the two garbage-collection algorithms.

The new space is collected with the Scavenge algorithm, whose implementation is essentially Cheney's algorithm.

Cheney's algorithm is a copying collector. It splits the memory into two halves, each called a semispace. One semispace is in use and the other sits idle; the one in use is called the From space, the idle one the To space, and objects are allocated in the From space. When a collection starts, the live objects in the From space are identified and copied into the To space (compacted as they are copied), while the memory occupied by dead objects is simply released. Once the copy is done, the From and To spaces swap roles, so a collection amounts to copying the live objects back and forth between the two semispaces. Obviously half of the memory is always empty, but because the new space is small the waste is limited; and since the vast majority of new-space objects are dead and only a small fraction of live objects need copying, the time efficiency is excellent. The copy itself is a BFS (breadth-first traversal): starting from the root objects, it visits every reachable object.

Note that from_space_ and to_space_ in V8 are named the other way around from the algorithm description above: objects are allocated in to_space_, and from_space_ serves as the target space of the copy.
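
To make the breadth-first copy easier to follow, here is a compact sketch of Cheney's algorithm itself, independent of V8's data structures. The fixed-size Object type and its explicit forwarding field are simplifications; as we will see below, V8 reuses the first word (the map pointer) of the old copy as the forwarding pointer:

#include <cstring>
#include <vector>

// Toy heap object: up to two child references plus a forwarding pointer.
struct Object {
  Object* forwarding;   // non-null once the object has been evacuated
  Object* child[2];
};

// Copy obj into to-space (simple bump allocation) unless it was copied already,
// and leave a forwarding pointer behind in the old copy.
static Object* Evacuate(Object* obj, Object*& to_top) {
  if (obj == nullptr) return nullptr;
  if (obj->forwarding != nullptr) return obj->forwarding;   // already moved
  Object* copy = to_top++;
  std::memcpy(copy, obj, sizeof(Object));
  copy->forwarding = nullptr;
  obj->forwarding = copy;
  return copy;
}

// Cheney scavenge: copy the roots, then scan to-space breadth-first; the scan
// pointer chases the allocation pointer until every copied object is visited.
static void Scavenge(std::vector<Object*>& roots, Object* to_space) {
  Object* scan = to_space;
  Object* top = to_space;
  for (Object*& root : roots) root = Evacuate(root, top);
  while (scan < top) {
    for (Object*& slot : scan->child) slot = Evacuate(slot, top);
    ++scan;
  }
}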

Here is the code of Heap::Scavenge:

void Heap::Scavenge() {
#ifdef DEBUG
  if (FLAG_enable_slow_asserts) {
    VerifyCodeSpacePointersVisitor v;
    HeapObjectIterator it(code_space_);
    while (it.has_next()) {
      HeapObject* object = it.next();
      if (object->IsCode()) {
        Code::cast(object)->ConvertICTargetsFromAddressToObject();
      }
      object->Iterate(&v);
      if (object->IsCode()) {
        Code::cast(object)->ConvertICTargetsFromObjectToAddress();
      }
    }
  }
#endif

  gc_state_ = SCAVENGE;

  // Implements Cheney's copying algorithm
  LOG(ResourceEvent("scavenge", "begin"));

  // To avoid frequent scavenges caused by a too-small new space, the scavenge
  // count is checked before each scavenge: if it exceeds the limit (initially 8)
  // and the new space can still be doubled (initially 256KB, at most 2MB), both
  // the space and the limit are doubled. This policy can be tuned in practice.
  scavenge_count_++;
  if (new_space_->Capacity() < new_space_->MaximumCapacity() &&
      scavenge_count_ > new_space_growth_limit_) {
    // Double the size of the new space, and double the limit.  The next
    // doubling attempt will occur after the current new_space_growth_limit_
    // more collections.
    // TODO(1240712): NewSpace::Double has a return value which is
    // ignored here.
    new_space_->Double();
    new_space_growth_limit_ *= 2;
  }

  // Flip the semispaces.  After flipping, to space is empty, from space has
  // live objects.
  // swap from_space and to_space
  // the two semispaces exchange their bookkeeping
  new_space_->Flip();
  // reset allocation_info_
  new_space_->ResetAllocationInfo();

  // We need to sweep newly copied objects which can be in either the to space
  // or the old space.  For to space objects, we use a mark.  Newly copied
  // objects lie between the mark and the allocation top.  For objects
  // promoted to old space, we write their addresses downward from the top of
  // the new space.  Sweeping newly promoted objects requires an allocation
  // pointer and a mark.  Note that the allocation pointer 'top' actually
  // moves downward from the high address in the to space.
  //
  // There is guaranteed to be enough room at the top of the to space for the
  // addresses of promoted objects: every object promoted frees up its size in
  // bytes from the top of the new space, and objects are at least one pointer
  // in size.  Using the new space to record promoted addresses makes the
  // scavenge collector agnostic to the allocation strategy (eg, linear or
  // free-list) used in old space.
  // pointers to promoted objects are recorded in new_space from the top downward
  // to_space_->low()
  Address new_mark = new_space_->ToSpaceLow();
  Address promoted_mark = new_space_->ToSpaceHigh();
  promoted_top = new_space_->ToSpaceHigh();

  CopyVisitor copy_visitor;
  // Copy roots.
  IterateRoots(&copy_visitor);

  // Copy objects reachable from the old generation.  By definition, there
  // are no intergenerational pointers in code space.
  IterateRSet(old_space_, &CopyObject);
  IterateRSet(map_space_, &CopyObject);
  lo_space_->IterateRSet(&CopyObject);

  bool has_processed_weak_pointers = false;

  while (true) {
    ASSERT(new_mark <= new_space_->top());
    ASSERT(promoted_mark >= promoted_top);

    // Copy objects reachable from newly copied objects.
    // breadth-first traversal
    // stop when the marks catch up with the tops
    // allocation_info_.top
    while (new_mark < new_space_->top() || promoted_mark > promoted_top) {
      // Sweep newly copied objects in the to space.  The allocation pointer
      // can change during sweeping.
      Address previous_top = new_space_->top();
      SemiSpaceIterator new_it(new_space_, new_mark);
      while (new_it.has_next()) {
        new_it.next()->Iterate(&copy_visitor);
      }
      new_mark = previous_top;

      // Sweep newly copied objects in the old space.  The promotion 'top'
      // pointer could change during sweeping.
      previous_top = promoted_top;
      for (Address current = promoted_mark - kPointerSize;
           current >= previous_top;
           current -= kPointerSize) {
        HeapObject* object = HeapObject::cast(Memory::Object_at(current));
        object->Iterate(&copy_visitor);
        UpdateRSet(object);
      }
      promoted_mark = previous_top;
    }

    if (has_processed_weak_pointers) break;  // We are done.
    // Copy objects reachable from weak pointers.
    GlobalHandles::IterateWeakRoots(&copy_visitor);
    has_processed_weak_pointers = true;
  }

  // Set age mark.
  new_space_->set_age_mark(new_mark);

  LOG(ResourceEvent("scavenge", "end"));

  gc_state_ = NOT_IN_GC;
}

This simply implements the algorithm described above. The main steps are:

1. To avoid frequent scavenges caused by a too-small new space, check the scavenge count before each scavenge; if it exceeds the limit (initially 8) and the new space can still be doubled (initially 256KB, at most 2MB), double both the space and the limit.
2. Flip from_space and to_space; the two semispaces exchange their bookkeeping
3. Reset allocation_info_
4. IterateRoots: copy the root objects in from_space (the to_space before the flip), including the strong_root_list, struct maps, symbols, the bootstrapper, top, debug, the compilation cache, handle scopes, builtins, global handles, the thread manager and so on, into to_space
5. Breadth-first traverse the nodes in to_space, copying every other object they reference
6. Finally, copy the objects on the global handle list that are in the weak or pending state

Let's pick out a few important points and look at them in detail.

IterateRoots

IterateRoots(&copy_visitor) copies the root objects into to_space_:

void Heap::IterateRoots(ObjectVisitor* v) {
  IterateStrongRoots(v);
  // copy object
  v->VisitPointer(reinterpret_cast<Object**>(&symbol_table_));
  SYNCHRONIZE_TAG("symbol_table");
}

void Heap::IterateStrongRoots(ObjectVisitor* v) {
#define ROOT_ITERATE(type, name) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_));
  STRONG_ROOT_LIST(ROOT_ITERATE);
#undef ROOT_ITERATE
  SYNCHRONIZE_TAG("strong_root_list");

#define STRUCT_MAP_ITERATE(NAME, Name, name) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_map_));
  STRUCT_LIST(STRUCT_MAP_ITERATE);
#undef STRUCT_MAP_ITERATE
  SYNCHRONIZE_TAG("struct_map");

#define SYMBOL_ITERATE(name, string) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_));
  SYMBOL_LIST(SYMBOL_ITERATE)
#undef SYMBOL_ITERATE
  SYNCHRONIZE_TAG("symbol");

  Bootstrapper::Iterate(v);
  SYNCHRONIZE_TAG("bootstrapper");
  Top::Iterate(v);
  SYNCHRONIZE_TAG("top");
  Debug::Iterate(v);
  SYNCHRONIZE_TAG("debug");

  // Iterate over local handles in handle scopes.
  HandleScopeImplementer::Iterate(v);
  SYNCHRONIZE_TAG("handlescope");

  // Iterate over the builtin code objects and code stubs in the heap. Note
  // that it is not strictly necessary to iterate over code objects on
  // scavenge collections.  We still do it here because this same function
  // is used by the mark-sweep collector and the deserializer.
  Builtins::IterateBuiltins(v);
  SYNCHRONIZE_TAG("builtins");

  // Iterate over global handles.
  GlobalHandles::IterateRoots(v);
  SYNCHRONIZE_TAG("globalhandles");

  // Iterate over pointers being held by inactive threads.
  ThreadManager::Iterate(v);
  SYNCHRONIZE_TAG("threadmanager");
}

This copies the strong_root_list and the other roots. The key is the statement v->VisitPointer(reinterpret_cast<Object**>(&name##_));, which actually invokes CopyVisitor::VisitPointer:

// Helper class for copying HeapObjects
class CopyVisitor: public ObjectVisitor {
 public:

  void VisitPointer(Object** p) {
    CopyObject(p);
  }

  void VisitPointers(Object** start, Object** end) {
    // Copy all HeapObject pointers in [start, end)
    for (Object** p = start; p < end; p++) CopyObject(p);
  }

 private:
  void CopyObject(Object** p) {
    if (!Heap::InFromSpace(*p)) return;
    Heap::CopyObject(reinterpret_cast<HeapObject**>(p));
  }
};

As we can see, the real work happens in Heap::CopyObject:

void Heap::CopyObject(HeapObject** p) {
  ASSERT(InFromSpace(*p));

  HeapObject* object = *p;

  // We use the first word (where the map pointer usually is) of a
  // HeapObject to record the forwarding pointer.  A forwarding pointer can
  // point to the old space, the code space, or the to space of the new
  // generation.
  // read the map (type) word
  // (reinterpret_cast<byte*>(p) + offset - kHeapObjectTag)
  HeapObject* first_word = object->map();

  // If the first word (where the map pointer is) is not a map pointer, the
  // object has already been copied.  We do not use first_word->IsMap()
  // because we know that first_word always has the heap object tag.
  // If the referenced (live) object has already been copied to to_space_, just update
  // the reference: the forwarding pointer points at the new object in to_space_.
  if (first_word->map()->instance_type() != MAP_TYPE) {
    *p = first_word;
    return;
  }

  // Optimization: Bypass ConsString objects where the right-hand side is
  // Heap::empty_string().  We do not use object->IsConsString because we
  // already know that object has the heap object tag.
  InstanceType type = Map::cast(first_word)->instance_type();
  if (type < FIRST_NONSTRING_TYPE &&
      String::cast(object)->representation_tag() == kConsStringTag &&
      ConsString::cast(object)->second() == Heap::empty_string()) {
    object = HeapObject::cast(ConsString::cast(object)->first());
    *p = object;
    // After patching *p we have to repeat the checks that object is in the
    // active semispace of the young generation and not already copied.
    if (!InFromSpace(object)) return;
    first_word = object->map();
    if (first_word->map()->instance_type() != MAP_TYPE) {
      *p = first_word;
      return;
    }
    type = Map::cast(first_word)->instance_type();
  }

  int object_size = object->SizeFromMap(Map::cast(first_word));
  Object* result;
  // If the object should be promoted, we try to copy it to old space.
  // promoted if it survived the previous scavenge (i.e. has survived two scavenges)
  // or if less than 75% of to_space is still free
  if (ShouldBePromoted(object->address(), object_size)) {
    // Heap numbers and sequential strings are promoted to code space, all
    // other object types are promoted to old space.  We do not use
    // object->IsHeapNumber() and object->IsSeqString() because we already
    // know that object has the heap object tag.
    bool has_pointers =
        type != HEAP_NUMBER_TYPE &&
        (type >= FIRST_NONSTRING_TYPE ||
         String::cast(object)->representation_tag() != kSeqStringTag);
    if (has_pointers) {
      result = old_space_->AllocateRaw(object_size);
    } else {
      result = code_space_->AllocateRaw(object_size);
    }

    if (!result->IsFailure()) {
      // object->set_map()
      // set forwarding pointer
      *p = MigrateObject(p, HeapObject::cast(result), object_size);
      if (has_pointers) {
        // Record the object's address at the top of the to space, to allow
        // it to be swept by the scavenger.
        promoted_top -= kPointerSize;
        Memory::Object_at(promoted_top) = *p;
      } else {
#ifdef DEBUG
        // Objects promoted to the code space should not have pointers to
        // new space.
        VerifyCodeSpacePointersVisitor v;
        (*p)->Iterate(&v);
#endif
      }
      return;
    }
  }

  // The object should remain in new space or the old space allocation failed.
  result = new_space_->AllocateRaw(object_size);
  // Failed allocation at this point is utterly unexpected.
  ASSERT(!result->IsFailure());
  *p = MigrateObject(p, HeapObject::cast(result), object_size);
}

This does the following:

1. If the referenced (live) object has already been copied to to_space_, simply update the reference via the forwarding pointer so that it points at the new object in to_space_. Note that the map pointer of the original object in from_space now points at the new copy in to_space_ or old_space, and it is exactly this map pointer that is used to detect whether the object has already been copied
2. Check whether the object should be promoted to old_space_ or code_space_; if so, allocate in the corresponding space and update promoted_top. Note that, for the later breadth-first traversal, the addresses of promoted objects have to be recorded: they are written into to_space starting from its high end and growing downward, and promoted_top marks the current (downward-growing) top of that record (sketched below)
3. Finally, if the object should stay in new space or the old-space allocation failed, call new_space_->AllocateRaw and set the forwarding pointer of the original object in from_space to point at the new object in to_space_
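
As a tiny stand-alone illustration of point 2 (the names below are made up; only the direction of growth matches the description above), recording the addresses of promoted objects downward from the high end of to-space looks roughly like this:

#include <cstdint>
#include <cstdio>

// Copied objects fill to-space from the low end, while the addresses of objects
// promoted to old space are written from the high end downward; the scavenger
// later walks both regions.
struct ToSpaceBookkeeping {
  uintptr_t new_mark;        // low end: first not-yet-scanned copied object
  uintptr_t promoted_top;    // high end: grows downward with promoted addresses
};

inline void RecordPromotedObject(ToSpaceBookkeeping* info, void* promoted) {
  info->promoted_top -= sizeof(void*);
  *reinterpret_cast<void**>(info->promoted_top) = promoted;
}

int main() {
  alignas(void*) unsigned char to_space[64];     // stand-in for to-space memory
  ToSpaceBookkeeping info;
  info.new_mark = reinterpret_cast<uintptr_t>(to_space);
  info.promoted_top = reinterpret_cast<uintptr_t>(to_space + sizeof(to_space));

  int promoted_object = 0;                       // pretend this now lives in old space
  RecordPromotedObject(&info, &promoted_object);
  printf("recorded %p at %#lx\n", (void*)&promoted_object,
         (unsigned long)info.promoted_top);
  return 0;
}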

Note the ShouldBePromoted method, which encodes the promotion condition:

bool Heap::ShouldBePromoted(Address old_address, int object_size) {
  // An object should be promoted if:
  // - the object has survived a scavenge operation or
  // - to space is already 25% full.
  return old_address < new_space_->age_mark()
      || (new_space_->Size() + object_size) >= (new_space_->Capacity() >> 2);
}

From this code, two conditions trigger promotion:

1. The To space is already more than 25% full (for example, with a 2MB semispace the threshold is 512KB)
2. The object survived the previous scavenge (i.e. it has survived two scavenges)

Here new_space_->age_mark() actually records new_mark, the top of the region that had been allocated in to_space, i.e. the top of the objects copied during the previous scavenge; this will become clearer in the breadth-first traversal of to_space_ described below.

Breadth-first traversal of to_space_

The scavenger breadth-first traverses the object pointers already in to_space, which cover both the objects newly copied into to_space_ and the objects newly promoted into old_space_. For each of the two groups it keeps a pair of pointers (a mark and a top) describing how far the traversal has got; when the two pairs of pointers meet, the traversal is done.

The code is as follows:

// Copy objects reachable from newly copied objects.
    // breadth-first traversal
    // stop when the marks catch up with the tops
    // allocation_info_.top
    while (new_mark < new_space_->top() || promoted_mark > promoted_top) {
      // Sweep newly copied objects in the to space.  The allocation pointer
      // can change during sweeping.
      Address previous_top = new_space_->top();
      SemiSpaceIterator new_it(new_space_, new_mark);
      while (new_it.has_next()) {
        new_it.next()->Iterate(&copy_visitor);
      }
      new_mark = previous_top;

      // Sweep newly copied objects in the old space.  The promotion 'top'
      // pointer could change during sweeping.
      previous_top = promoted_top;
      for (Address current = promoted_mark - kPointerSize;
           current >= previous_top;
           current -= kPointerSize) {
        HeapObject* object = HeapObject::cast(Memory::Object_at(current));
        object->Iterate(&copy_visitor);
        UpdateRSet(object);
      }
      promoted_mark = previous_top;
    }

To see what these pointers do, it helps to walk through a concrete scenario.

If the explanation above still feels hazy, the example in the article 浅谈V8引擎中的垃圾回收机制 walks through such a scenario quite clearly.

Mark-Compact

The Mark-Compact algorithm has two main phases:

1. The marking phase, which finds and marks all live objects
2. The compaction/sweeping phase, which tidies up the heap by copying the live objects into a new, contiguous region
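
Both phases depend on a mark bit that this V8 version stashes in the object's map word, which is why the code below keeps calling helpers such as clear_mark_bit on the map pointer. A minimal sketch of that encoding, assuming the low bit of the (aligned, hence normally zero) map word is used as the mark; the real tagging details may differ:

#include <cstdint>

// Heap object pointers are aligned, so the low bit of the map word is normally
// 0 and can be borrowed as the mark bit for the duration of a mark-compact GC.
inline uintptr_t set_mark_bit(uintptr_t map_word)   { return map_word | 1u; }
inline uintptr_t clear_mark_bit(uintptr_t map_word) { return map_word & ~static_cast<uintptr_t>(1); }
inline bool      is_marked(uintptr_t map_word)      { return (map_word & 1u) != 0; }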

The Mark-Compact code looks like this:

void Heap::MarkCompact() {
  gc_state_ = MARK_COMPACT;
#ifdef DEBUG
  mc_count_++;
#endif
  LOG(ResourceEvent("markcompact", "begin"));

  MarkCompactPrologue();

  MarkCompactCollector::CollectGarbage();

  MarkCompactEpilogue();

  LOG(ResourceEvent("markcompact", "end"));

  gc_state_ = NOT_IN_GC;

  Shrink();

  Counters::objs_since_last_full.Set(0);
}

This runs the GC prologue (setting state flags and so on), the collection itself, the GC epilogue (matching the prologue), and finally Shrink() to compact the spaces; each part is described in detail below.

MarkCompactPrologue

MarkCompactPrologue looks like this:

void Heap::MarkCompactPrologue() {
  // drop cached compiled code
  RegExpImpl::OldSpaceCollectionPrologue();
  // for every stack handler in every stack frame of the current thread, rewrite its
  // program counter (PC) as an offset in place of the original absolute address
  Top::MarkCompactPrologue();
  // do the same for all other threads
  ThreadManager::MarkCompactPrologue();
}

This does two things:

1. Drop cached compiled code
2. For every stack handler in every stack frame of every thread, rewrite its pc_address (an absolute address becomes an offset)

Top::MarkCompactPrologue() ends up calling the following code:

void Top::MarkCompactPrologue(ThreadLocalTop* thread) {
  StackFrame::CookFramesForThread(thread);
}

void StackFrame::CookFramesForThread(ThreadLocalTop* thread) {
  ASSERT(!thread->stack_is_cooked());
  for (StackFrameIterator it(thread); !it.done(); it.Advance()) {
    it.frame()->Cook();
  }
  thread->set_stack_is_cooked(true);
}

void StackFrame::Cook() {
  Code* code = FindCode();
  for (StackHandlerIterator it(this, top_handler()); !it.done(); it.Advance()) {
    it.handler()->Cook(code);
  }
  ASSERT(code->contains(pc()));
  set_pc(AddressFrom<Address>(pc() - code->instruction_start()));
}

void StackHandler::Cook(Code* code) {
  ASSERT(code->contains(pc()));
  set_pc(AddressFrom<Address>(pc() - code->instruction_start()));
}

As we can see from the code above, it is ultimately StackHandler::set_pc that sets the pc_address.
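
The point of this "cooking" is to let saved program counters survive code relocation: an absolute pc would dangle once its Code object moves during compaction, while an offset from the start of the code object stays valid and can be turned back into an absolute pc afterwards. A small sketch of the idea, with hypothetical stand-in types:

#include <cstdint>

// Hypothetical stand-ins for V8's Code object and a saved program counter.
struct Code {
  uintptr_t instruction_start;   // address of the first byte of generated code
};

// Before compaction: store the pc as an offset into its Code object ("cook").
inline uintptr_t CookPc(uintptr_t pc, const Code& code) {
  return pc - code.instruction_start;
}

// After compaction (the Code object may have moved): restore an absolute pc.
inline uintptr_t UncookPc(uintptr_t offset, const Code& moved_code) {
  return moved_code.instruction_start + offset;
}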

MarkCompactCollector::CollectGarbage

Now we reach the actual collection phase of Mark-Compact:

void MarkCompactCollector::CollectGarbage() {
  // some preparatory work
  Prepare();

  // depth-first traversal from the root objects, marking live objects
  MarkLiveObjects();

  // sweep LargeObjectSpace, freeing unmarked large objects
  SweepLargeObjectSpace();

  if (compacting_collection_) {
    // allocate new space and record each new object's address in the old object's
    // map pointer, or at the corresponding offset in from_space
    EncodeForwardingAddresses();

    // update every pointer to a live object so that it points at the new address
    UpdatePointers();

    // copy each original object's memory to its new location
    // uses memmove
    RelocateObjects();

    // rebuild the remembered sets
    // remembered sets are sparse, faster (eg, binary) search for set bits
    RebuildRSets();

  } else {
    // free the memory of unmarked objects
    SweepSpaces();
  }

  Finish();
}

This does the following:

1. Determine compacting_collection_, which decides whether the later phases will compact memory
2. Depth-first traverse from the root objects and mark the live objects
3. Sweep LargeObjectSpace, freeing the unmarked large objects
4. Check whether compaction is needed, i.e. whether `compacting_collection_` is `true`
    a. Compaction needed
        i. Allocate new space and record each new object's address in the old object's map pointer, or at the corresponding offset in from_space
        ii. Update every pointer to a live object so that it points at the new object's address
        iii. Copy each object's memory to its new location using memmove
        iv. Rebuild the remembered sets (remembered sets are sparse, allowing fast, e.g. binary, search for set bits)
    b. No compaction needed, just sweep
        i. Free the memory of unmarked objects
5. Finish, which clears the StubCache.

Let's go through these phases in detail:

Prepare

Prepare determines whether the current state calls for compaction or whether plain sweeping is enough:

void MarkCompactCollector::Prepare() {
  static const int kFragmentationLimit = 50;  // Percent.
#ifdef DEBUG
  ASSERT(state_ == IDLE);
  state_ = PREPARE_GC;
#endif
  ASSERT(!FLAG_always_compact || !FLAG_never_compact);

  compacting_collection_ = FLAG_always_compact;

  // We compact the old generation if it gets too fragmented (ie, we could
  // recover an expected amount of space by reclaiming the waste and free
  // list blocks).  We always compact when the flag --gc-global is true
  // because objects do not get promoted out of new space on non-compacting
  // GCs.
  // compact when fragmentation is severe
  // always compact when --gc-global is true
  if (!compacting_collection_) {
    // recoverable space
    // the whole space consists of Size (used) + Waste (wasted) + AvailableFree (free)
    int old_gen_recoverable = Heap::old_space()->Waste()
                            + Heap::old_space()->AvailableFree()
                            + Heap::code_space()->Waste()
                            + Heap::code_space()->AvailableFree();
    int old_gen_used = old_gen_recoverable
                     + Heap::old_space()->Size()
                     + Heap::code_space()->Size();
    int old_gen_fragmentation = (old_gen_recoverable * 100) / old_gen_used;
    // old_gen_fragmentation > 50
    if (old_gen_fragmentation > kFragmentationLimit) {
      compacting_collection_ = true;
    }
  }

  if (FLAG_never_compact) compacting_collection_ = false;

#ifdef DEBUG
  if (compacting_collection_) {
    // We will write bookkeeping information to the remembered set area
    // starting now.
    // mark the pages' remembered-set state as NOT_IN_USE
    Page::set_rset_state(Page::NOT_IN_USE);
  }
#endif

  Heap::map_space()->PrepareForMarkCompact(compacting_collection_);
  Heap::old_space()->PrepareForMarkCompact(compacting_collection_);
  Heap::code_space()->PrepareForMarkCompact(compacting_collection_);

  Counters::global_objects.Set(0);

#ifdef DEBUG
  live_bytes_ = 0;
  live_young_objects_ = 0;
  live_old_objects_ = 0;
  live_immutable_objects_ = 0;
  live_map_objects_ = 0;
  live_lo_objects_ = 0;
#endif
}

So compaction is used when fragmentation is severe: the ratio of recoverable space to total used space is computed, and compaction is chosen when it exceeds 50%. For example, if 30MB of a 50MB old plus code generation is recoverable, fragmentation is 60% and the collector compacts.

MarkLiveObjects

MarkLiveObjects marks all live objects with a depth-first traversal starting from the roots:

// traverse the heap and mark every live object
void MarkCompactCollector::MarkLiveObjects() {
#ifdef DEBUG
  ASSERT(state_ == PREPARE_GC);
  state_ = MARK_LIVE_OBJECTS;
#endif
  // The to space contains live objects, the from space is used as a marking
  // stack.
  marking_stack.Initialize(Heap::new_space()->FromSpaceLow(),
                           Heap::new_space()->FromSpaceHigh());

  // returns marking_stack.is_overflow
  ASSERT(!marking_stack.overflowed());

  // Mark the heap roots, including global variables, stack variables, etc.
  // visit the roots: set the mark and push onto the stack
  MarkingVisitor marking_visitor;
  Heap::IterateStrongRoots(&marking_visitor);

  // Take care of the symbol table specially.
  SymbolTable* symbol_table = SymbolTable::cast(Heap::symbol_table());
#ifdef DEBUG
  UpdateLiveObjectCount(symbol_table);
#endif

  // 1. mark the prefix of the symbol table and push the objects on
  // the stack.
  symbol_table->IteratePrefix(&marking_visitor);
  // 2. mark the symbol table without pushing it on the stack.
  set_mark(symbol_table);  // map word is changed.

  bool has_processed_weak_pointers = false;

  // Mark objects reachable from the roots.
  while (true) {
    // depth-first traversal: mark and push
    MarkObjectsReachableFromTopFrame();

    if (!marking_stack.overflowed()) {
      if (has_processed_weak_pointers) break;
      // First we mark weak pointers not yet reachable.
      GlobalHandles::MarkWeakRoots(&MustBeMarked);
      // Then we process weak pointers and process the transitive closure.
      GlobalHandles::IterateWeakRoots(&marking_visitor);
      has_processed_weak_pointers = true;
      continue;
    }

    // The marking stack overflowed, we need to rebuild it by scanning the
    // whole heap.
    marking_stack.clear_overflowed();

    // We have early stops if the stack overflowed again while scanning
    // overflowed objects in a space.
    SemiSpaceIterator new_it(Heap::new_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&new_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator old_it(Heap::old_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&old_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator code_it(Heap::code_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&code_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator map_it(Heap::map_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&map_it);
    if (marking_stack.overflowed()) continue;

    LargeObjectIterator lo_it(Heap::lo_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&lo_it);
  }

  // Prune the symbol table removing all symbols only pointed to by
  // the symbol table.
  SymbolTableCleaner v;
  symbol_table->IterateElements(&v);
  symbol_table->ElementsRemoved(v.PointersRemoved());

#ifdef DEBUG
  if (FLAG_verify_global_gc) VerifyHeapAfterMarkingPhase();
#endif

  // Remove object groups after marking phase.
  GlobalHandles::RemoveObjectGroups();

  // Objects in the active semispace of the young generation will be relocated
  // to the inactive semispace.  Set the relocation info to the beginning of
  // the inactive semispace.
  Heap::new_space()->MCResetRelocationInfo();
}

void MarkCompactCollector::MarkObjectsReachableFromTopFrame() {
  MarkingVisitor marking_visitor;
  do {
    while (!marking_stack.is_empty()) {
      // pop from marking_stack
      HeapObject* obj = marking_stack.Pop();
      ASSERT(Heap::Contains(obj));
      ASSERT(is_marked(obj) && !is_overflowed(obj));

      // Because the object is marked, the map pointer is not tagged as a
      // normal HeapObject pointer, we need to recover the map pointer,
      // then use the map pointer to mark the object body.
      intptr_t map_word = reinterpret_cast<intptr_t>(obj->map());
      Map* map = reinterpret_cast<Map*>(clear_mark_bit(map_word));
      MarkObject(map);
      // visit its children
      obj->IterateBody(map->instance_type(), obj->SizeFromMap(map),
                       &marking_visitor);
    };
    // Check objects in object groups.
    MarkObjectGroups(&marking_visitor);
  } while (!marking_stack.is_empty());
}

This does the following:

1. Initialize the stack used for the depth-first traversal; from_space of the new space is reused directly as the marking stack
2. Visit the objects reachable from the roots, marking them and pushing them onto the stack
3. Pop the objects off the stack one by one and continue the depth-first marking
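
Conceptually the marking phase is a depth-first traversal driven by an explicit stack (V8 reuses from_space as that stack and encodes the mark in the map word). A reduced sketch over a toy object graph:

#include <vector>

// Toy object: a mark flag plus references to children. V8 instead stores the
// mark in the map word and walks children via the object's body iterators.
struct Obj {
  bool marked = false;
  std::vector<Obj*> children;
};

// Mark everything reachable from the roots using an explicit marking stack.
void MarkLiveObjects(const std::vector<Obj*>& roots) {
  std::vector<Obj*> marking_stack;
  for (Obj* root : roots) {
    if (root && !root->marked) { root->marked = true; marking_stack.push_back(root); }
  }
  while (!marking_stack.empty()) {
    Obj* obj = marking_stack.back();
    marking_stack.pop_back();
    for (Obj* child : obj->children) {
      if (child && !child->marked) { child->marked = true; marking_stack.push_back(child); }
    }
  }
}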

The marking itself looks like this:

// Mark object pointed to by p.
  void MarkObjectByPointer(Object** p) {
    Object* obj = *p;
    if (!obj->IsHeapObject()) return;

    // Optimization: Bypass ConsString object where right size is
    // Heap::empty_string().
    // Please note this checks performed equals:
    //   object->IsConsString() &&
    //   (ConsString::cast(object)->second() == Heap::empty_string())
    // except the map for the object might be marked.
    intptr_t map_word =
        reinterpret_cast<intptr_t>(HeapObject::cast(obj)->map());
    uint32_t tag =
        (reinterpret_cast<Map*>(clear_mark_bit(map_word)))->instance_type();
    if ((tag < FIRST_NONSTRING_TYPE) &&
        (kConsStringTag ==
         static_cast<StringRepresentationTag>(tag &
                                              kStringRepresentationMask)) &&
        (Heap::empty_string() ==
         reinterpret_cast<String*>(
             reinterpret_cast<ConsString*>(obj)->second()))) {
      // Since we don't have the object start it is impossible to update the
      // remeber set quickly.  Therefore this optimization only is taking
      // place when we can avoid changing.
      Object* first = reinterpret_cast<ConsString*>(obj)->first();
      if (Heap::InNewSpace(obj) || !Heap::InNewSpace(first)) {
        obj = first;
        *p = obj;
      }
    }
    MarkCompactCollector::MarkObject(HeapObject::cast(obj));
  }

The ConsString check here is an optimization: if the object is a ConsString whose second part is Heap::empty_string(), the pointer is redirected to its first part instead. The object is then marked by recording the mark in its map pointer word and pushed onto the marking stack.

Compact

When compaction is needed, the following steps are performed:

EncodeForwardingAddresses

EncodeForwardingAddresses allocates the target space and records each new object's address either in the old object's map pointer or at the corresponding offset in from_space:

void MarkCompactCollector::EncodeForwardingAddresses() {
  ASSERT(state_ == ENCODE_FORWARDING_ADDRESSES);
  // Compute the forwarding pointers in each space.
  // allocate the new space and record the new object's address in the old object's map pointer
  // &mc_allocation_info records the allocation state (top, etc.)
  EncodeForwardingAddressesInPagedSpace<MCAllocateFromOldSpace,
                                        IgnoreNonLiveObject>(
      Heap::old_space());

  EncodeForwardingAddressesInPagedSpace<MCAllocateFromCodeSpace,
                                        LogNonLiveCodeObject>(
      Heap::code_space());

  // Compute new space next to last after the old and code spaces have been
  // compacted.  Objects in new space can be promoted to old or code space.
  // new-space objects may be promoted to the old generation here
  // note that new space uses from_space to record the new addresses: since to_space and
  // from_space are the same size, the slot at the same offset in from_space records the
  // corresponding new object's address
  EncodeForwardingAddressesInNewSpace();

  // Compute map space last because computing forwarding addresses
  // overwrites non-live objects.  Objects in the other spaces rely on
  // non-live map pointers to get the sizes of non-live objects.
  EncodeForwardingAddressesInPagedSpace<MCAllocateFromMapSpace,
                                        IgnoreNonLiveObject>(
      Heap::map_space());

  // Write relocation info to the top page, so we can use it later.  This is
  // done after promoting objects from the new space so we get the correct
  // allocation top.
  Heap::old_space()->MCWriteRelocationInfoToPage();
  Heap::code_space()->MCWriteRelocationInfoToPage();
  Heap::map_space()->MCWriteRelocationInfoToPage();
}

Each memory space is handled in turn here. Below we focus on the handling of old_space and new_space.

The code that allocates and records forwarding addresses for old_space is as follows:

template<MarkCompactCollector::AllocationFunction Alloc,
         MarkCompactCollector::ProcessNonLiveFunction ProcessNonLive>
void MarkCompactCollector::EncodeForwardingAddressesInPagedSpace(
    PagedSpace* space) {
  PageIterator it(space, PageIterator::PAGES_IN_USE);
  while (it.has_next()) {
    Page* p = it.next();
    // The offset of each live object in the page from the first live object
    // in the page.
    int offset = 0;
    // Allocate new memory for each marked object and store the new object's
    // offset in the old object. Processing is done page by page.
    EncodeForwardingAddressesInRange<Alloc,
                                     EncodeForwardingAddressInPagedSpace,
                                     ProcessNonLive>(
        p->ObjectAreaStart(),
        p->AllocationTop(),
        &offset);
  }
}

This loops over every page in use and applies EncodeForwardingAddressesInRange to it:

// Function template that, given a range of addresses (eg, a semispace or a
// paged space page), iterates through the objects in the range to clear
// mark bits and compute and encode forwarding addresses.  As a side effect,
// maximal free chunks are marked so that they can be skipped on subsequent
// sweeps.
//
// The template parameters are an allocation function, a forwarding address
// encoding function, and a function to process non-live objects.
template<MarkCompactCollector::AllocationFunction Alloc,
         MarkCompactCollector::EncodingFunction Encode,
         MarkCompactCollector::ProcessNonLiveFunction ProcessNonLive>
inline void EncodeForwardingAddressesInRange(Address start,
                                             Address end,
                                             int* offset) {
  // The start address of the current free region while sweeping the space.
  // This address is set when a transition from live to non-live objects is
  // encountered.  A value (an encoding of the 'next free region' pointer)
  // is written to memory at this address when a transition from non-live to
  // live objects is encountered.
  Address free_start = NULL;

  // A flag giving the state of the previously swept object.  Initially true
  // to ensure that free_start is initialized to a proper address before
  // trying to write to it.
  bool is_prev_alive = true;

  int object_size;  // Will be set on each iteration of the loop.
  for (Address current = start; current < end; current += object_size) {
    HeapObject* object = HeapObject::FromAddress(current);
    if (is_marked(object)) {
      clear_mark(object);
      object_size = object->Size();

      Object* forwarded = Alloc(object, object_size);
      // Allocation cannot fail, because we are compacting the space.
      ASSERT(!forwarded->IsFailure());
      Encode(object, object_size, forwarded, offset);

#ifdef DEBUG
      if (FLAG_gc_verbose) {
        PrintF("forward %p -> %p.\n", object->address(),
               HeapObject::cast(forwarded)->address());
      }
#endif
      if (!is_prev_alive) {  // Transition from non-live to live.
        EncodeFreeRegion(free_start, current - free_start);
        is_prev_alive = true;
      }
    } else {  // Non-live object.
      object_size = object->Size();
      ProcessNonLive(object);
      if (is_prev_alive) {  // Transition from live to non-live.
        free_start = current;
        is_prev_alive = false;
      }
    }
  }

  // If we ended on a free region, mark it.
  if (!is_prev_alive) EncodeFreeRegion(free_start, end - free_start);
}

EncodeForwardingAddressesInRange does two main things for every live object:

1.Allocate new memory via Alloc
2.Record the new object's address via Encode

For paged spaces, Encode is the template parameter EncodeForwardingAddressInPagedSpace, which records the new_object address in the old_object. The code is as follows:

// The forwarding address is encoded in the map pointer of the object as an
// offset (in terms of live bytes) from the address of the first live object
// in the page.
// The forwarding address is expressed as the distance (in live bytes) from the
// first live object in the current page and is stored in the object's map pointer.
inline void EncodeForwardingAddressInPagedSpace(HeapObject* old_object,
                                                int object_size,
                                                Object* new_object,
                                                int* offset) {
  // Record the forwarding address of the first live object if necessary.
  if (*offset == 0) {
    Page::FromAddress(old_object->address())->mc_first_forwarded =
        HeapObject::cast(new_object)->address();
  }

  uint32_t encoded = EncodePointers(old_object->map()->address(), *offset);
  old_object->set_map(reinterpret_cast<Map*>(encoded));
  *offset += object_size;
  ASSERT(*offset <= Page::kObjectAreaSize);
}

Note that the new address assigned to the first live object in old_object's page is recorded in that page's mc_first_forwarded field; the map pointer of each old_object then stores only the offset (in live bytes) of its new location relative to that first forwarded address. The encoded map word also keeps enough information (for example the map's page index) to recover the original map pointer later.
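
Putting the two pieces together, the forwarding address of a paged-space object can later be recomputed from its page's mc_first_forwarded plus the offset packed into its map word. The sketch below shows the idea in isolation; it treats EncodePointers/DecodeOffset as opaque pack/unpack helpers and ignores the page-overflow case that the real GetForwardingAddressInOldSpace handles.

// Illustrative sketch of storing and recovering a paged-space forwarding
// address; simplified, ignoring the case where the new location spills
// into the next page.
void EncodeForwardingSketch(HeapObject* old_object, HeapObject* new_object,
                            int object_size, int* live_offset) {
  Page* page = Page::FromAddress(old_object->address());
  if (*live_offset == 0) {
    // First live object on this page: remember its new address once.
    page->mc_first_forwarded = new_object->address();
  }
  // Pack (map pointer, live-byte offset) into the old object's map word.
  uint32_t encoded = EncodePointers(old_object->map()->address(), *live_offset);
  old_object->set_map(reinterpret_cast<Map*>(encoded));
  *live_offset += object_size;
}

Address DecodeForwardingSketch(HeapObject* old_object) {
  uint32_t encoded = reinterpret_cast<uint32_t>(old_object->map());
  int offset = DecodeOffset(encoded);  // live bytes before this object
  Page* page = Page::FromAddress(old_object->address());
  return page->mc_first_forwarded + offset;
}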

The allocation and forwarding-address recording for the new space looks like this:

// Functions to encode the forwarding pointers in each compactable space.
void MarkCompactCollector::EncodeForwardingAddressesInNewSpace() {
  int ignored;
  EncodeForwardingAddressesInRange<MCAllocateFromNewSpace,
                                   EncodeForwardingAddressInNewSpace,
                                   IgnoreNonLiveObject>(
      Heap::new_space()->bottom(),
      Heap::new_space()->top(),
      &ignored);
}

EncodeForwardingAddressesInRange performs the same two jobs as in old_space, allocation and recording, only with different template functions.

New-space allocation was covered earlier; the difference from the old generation is that allocation here first tries to promote the object out of the new space. The code is as follows:

// Try to promote all objects in new space.  Heap numbers and sequential
// strings are promoted to the code space, all others to the old space.
inline Object* MCAllocateFromNewSpace(HeapObject* object, int object_size) {
  bool has_pointers = !object->IsHeapNumber() && !object->IsSeqString();
  Object* forwarded = has_pointers ?
      Heap::old_space()->MCAllocateRaw(object_size) :
      Heap::code_space()->MCAllocateRaw(object_size);

  if (forwarded->IsFailure()) {
    forwarded = Heap::new_space()->MCAllocateRaw(object_size);
  }
  return forwarded;
}

As the code shows, the object is first promoted, to old_space if it may contain pointers and to code_space otherwise; only if that allocation fails is it allocated back into new_space.

The way a new-space old_object records its new_object address also differs from old_space, as the following code shows:

// The forwarding address is encoded at the same offset as the current
// to-space object, but in from space.
// Record the new_object address at the same offset in from_space.
// new_object may have been promoted to the old generation or may still be in the new space.
inline void EncodeForwardingAddressInNewSpace(HeapObject* old_object,
                                              int object_size,
                                              Object* new_object,
                                              int* ignored) {
  int offset =
      Heap::new_space()->ToSpaceOffsetForAddress(old_object->address());
  Memory::Address_at(Heap::new_space()->FromSpaceLow() + offset) =
      HeapObject::cast(new_object)->address();
}

As the code shows, the slot at the same offset in from_space is used to record the new_object address. This works because to_space and from_space have exactly the same size, so every to_space address has a unique mirror slot in from_space.
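
In other words, the two semispaces act as parallel arrays: every to_space address has a mirror slot at the same offset in from_space, and that slot stores the forwarding address. A minimal sketch of both directions, using the same helpers that appear in the excerpts above:

// Illustrative sketch of the from_space "mirror slot" trick.
Address ForwardingSlotForSketch(HeapObject* old_object) {
  int offset =
      Heap::new_space()->ToSpaceOffsetForAddress(old_object->address());
  return Heap::new_space()->FromSpaceLow() + offset;
}

// Recording side (what EncodeForwardingAddressInNewSpace does).
void RecordNewSpaceForwardingSketch(HeapObject* old_object,
                                    HeapObject* new_object) {
  Memory::Address_at(ForwardingSlotForSketch(old_object)) =
      new_object->address();
}

// Lookup side (what UpdatePointer later does for new-space objects).
Address LookupNewSpaceForwardingSketch(HeapObject* old_object) {
  return Memory::Address_at(ForwardingSlotForSketch(old_object));
}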

UpdatePointers

The UpdatePointers code is as follows:

void MarkCompactCollector::UpdatePointers() {
#ifdef DEBUG
  ASSERT(state_ == ENCODE_FORWARDING_ADDRESSES);
  state_ = UPDATE_POINTERS;
#endif
  UpdatingVisitor updating_visitor;
  Heap::IterateRoots(&updating_visitor);
  GlobalHandles::IterateWeakRoots(&updating_visitor);

  int live_maps = IterateLiveObjects(Heap::map_space(),
                                     &UpdatePointersInOldObject);
  int live_olds = IterateLiveObjects(Heap::old_space(),
                                     &UpdatePointersInOldObject);
  int live_immutables = IterateLiveObjects(Heap::code_space(),
                                           &UpdatePointersInOldObject);
  int live_news = IterateLiveObjects(Heap::new_space(),
                                     &UpdatePointersInNewObject);

  // Large objects do not move, the map word can be updated directly.
  LargeObjectIterator it(Heap::lo_space());
  while (it.has_next()) UpdatePointersInNewObject(it.next());

  USE(live_maps);
  USE(live_olds);
  USE(live_immutables);
  USE(live_news);

#ifdef DEBUG
  ASSERT(live_maps == live_map_objects_);
  ASSERT(live_olds == live_old_objects_);
  ASSERT(live_immutables == live_immutable_objects_);
  ASSERT(live_news == live_young_objects_);

  if (FLAG_verify_global_gc) VerifyHeapAfterUpdatingPointers();
#endif
}

This phase updates every pointer that refers to a live object so that it points at the object's new address:

1.Iterate the root objects and update their pointers to the newly allocated objects
2.Iterate all spaces and update the pointers inside live objects

A few important points are worth highlighting.

The lookup and update of the new address happen in MarkCompactCollector::UpdatePointer:

// Look up the new address and update the pointer.
void MarkCompactCollector::UpdatePointer(Object** p) {
  // We need to check if p is in to_space.
  if (!(*p)->IsHeapObject()) return;

  HeapObject* obj = HeapObject::cast(*p);
  Address old_addr = obj->address();
  Address new_addr;

  ASSERT(!Heap::InFromSpace(obj));

  if (Heap::new_space()->Contains(obj)) {
    Address f_addr = Heap::new_space()->FromSpaceLow() +
                     Heap::new_space()->ToSpaceOffsetForAddress(old_addr);
    new_addr = Memory::Address_at(f_addr);

#ifdef DEBUG
    ASSERT(Heap::old_space()->Contains(new_addr) ||
           Heap::code_space()->Contains(new_addr) ||
           Heap::new_space()->FromSpaceContains(new_addr));

    if (Heap::new_space()->FromSpaceContains(new_addr)) {
      ASSERT(Heap::new_space()->FromSpaceOffsetForAddress(new_addr) <=
             Heap::new_space()->ToSpaceOffsetForAddress(old_addr));
    }
#endif

  } else if (Heap::lo_space()->Contains(obj)) {
    // Don't move objects in the large object space.
    new_addr = obj->address();

  } else {
    ASSERT(Heap::old_space()->Contains(obj) ||
           Heap::code_space()->Contains(obj) ||
           Heap::map_space()->Contains(obj));

    new_addr = GetForwardingAddressInOldSpace(obj);
    ASSERT(Heap::old_space()->Contains(new_addr) ||
           Heap::code_space()->Contains(new_addr) ||
           Heap::map_space()->Contains(new_addr));

#ifdef DEBUG
    if (Heap::old_space()->Contains(obj)) {
      ASSERT(Heap::old_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::old_space()->MCSpaceOffsetForAddress(old_addr));
    } else if (Heap::code_space()->Contains(obj)) {
      ASSERT(Heap::code_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::code_space()->MCSpaceOffsetForAddress(old_addr));
    } else {
      ASSERT(Heap::map_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::map_space()->MCSpaceOffsetForAddress(old_addr));
    }
#endif
  }

  *p = HeapObject::FromAddress(new_addr);

#ifdef DEBUG
  if (FLAG_gc_verbose) {
    PrintF("update %p : %p -> %p\n",
           reinterpret_cast<Address>(p), old_addr, new_addr);
  }
#endif
}

Updating the pointer itself is a simple assignment; the real work is looking up the new address:

1.For a new-space object, read it from the slot at the same offset in from_space
2.For a paged-space object, compute it with GetForwardingAddressInOldSpace

GetForwardingAddressInOldSpace looks like this:

Address MarkCompactCollector::GetForwardingAddressInOldSpace(HeapObject* obj) {
  // Object should either in old or map space.
  uint32_t encoded = reinterpret_cast<uint32_t>(obj->map());

  // Offset to the first live object's forwarding address.
  int offset = DecodeOffset(encoded);
  Address obj_addr = obj->address();

  // Find the first live object's forwarding address.
  Page* p = Page::FromAddress(obj_addr);
  Address first_forwarded = p->mc_first_forwarded;

  // Page start address of forwarded address.
  Page* forwarded_page = Page::FromAddress(first_forwarded);
  int forwarded_offset = forwarded_page->Offset(first_forwarded);

  // Find end of allocation of in the page of first_forwarded.
  Address mc_top = forwarded_page->mc_relocation_top;
  int mc_top_offset = forwarded_page->Offset(mc_top);

  // Check if current object's forward pointer is in the same page
  // as the first live object's forwarding pointer
  // Still within the first forwarded page.
  if (forwarded_offset + offset < mc_top_offset) {
    // In the same page.
    return first_forwarded + offset;
  }

  // Not in that page; the new location spills over into the next page.
  // Must be in the next page, NOTE: this may cross chunks.
  Page* next_page = forwarded_page->next_page();
  ASSERT(next_page->is_valid());

  offset -= (mc_top_offset - forwarded_offset);
  offset += Page::kObjectStartOffset;

  ASSERT_PAGE_OFFSET(offset);
  ASSERT(next_page->OffsetToAddress(offset) < next_page->mc_relocation_top);

  return next_page->OffsetToAddress(offset);
}

This does the following:

1.Read mc_first_forwarded from the object's page, i.e. the new address assigned to the first live object on that page
2.Decode the offset from the object's map word
3.Check whether the new location falls within the same page
    a.If it does, simply return first_forwarded + offset
    b.If it does not (forwarded_offset + offset reaches the page's relocation top), the object was relocated into the next page, so the offset is adjusted and the address is obtained with next_page->OffsetToAddress(offset); see the short worked example below
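
A small worked example of case b, with made-up numbers, showing how the offset is adjusted when the forwarded location crosses into the next page:

// Worked example for the page-crossing case (all numbers are made up).
// On the page of first_forwarded:
//   forwarded_offset = 0x0100   // offset of first_forwarded in its page
//   mc_top_offset    = 0x1F00   // offset of that page's relocation top
// and the object's decoded offset is 0x2000 live bytes.
//
// forwarded_offset + offset = 0x2100 >= mc_top_offset, so the new location
// is on the next page. The remaining distance past that page's top is:
//   offset -= (mc_top_offset - forwarded_offset);  // 0x2000 - 0x1E00 = 0x0200
//   offset += Page::kObjectStartOffset;            // skip the page header
//   new_addr = next_page->OffsetToAddress(offset);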

RelocateObjects

RelocateObjects copies each original object into its new memory location. The code is as follows:

void MarkCompactCollector::RelocateObjects() {
#ifdef DEBUG
  ASSERT(state_ == UPDATE_POINTERS);
  state_ = RELOCATE_OBJECTS;
#endif
  // Relocates objects, always relocate map objects first. Relocating
  // objects in other space relies on map objects to get object size.
  int live_maps = IterateLiveObjects(Heap::map_space(), &RelocateMapObject);
  int live_olds = IterateLiveObjects(Heap::old_space(), &RelocateOldObject);
  int live_immutables =
      IterateLiveObjects(Heap::code_space(), &RelocateCodeObject);
  int live_news = IterateLiveObjects(Heap::new_space(), &RelocateNewObject);

  USE(live_maps);
  USE(live_olds);
  USE(live_immutables);
  USE(live_news);
#ifdef DEBUG
  ASSERT(live_maps == live_map_objects_);
  ASSERT(live_olds == live_old_objects_);
  ASSERT(live_immutables == live_immutable_objects_);
  ASSERT(live_news == live_young_objects_);
#endif

  // Notify code object in LO to convert IC target to address
  // This must happen after lo_space_->Compact
  LargeObjectIterator it(Heap::lo_space());
  while (it.has_next()) { ConvertCodeICTargetToAddress(it.next()); }

  // Flips from and to spaces
  Heap::new_space()->Flip();

  // Sets age_mark to bottom in to space
  Address mark = Heap::new_space()->bottom();
  Heap::new_space()->set_age_mark(mark);

  Heap::new_space()->MCCommitRelocationInfo();
#ifdef DEBUG
  // It is safe to write to the remembered sets as remembered sets on a
  // page-by-page basis after committing the m-c forwarding pointer.
  Page::set_rset_state(Page::IN_USE);
#endif
  Heap::map_space()->MCCommitRelocationInfo();
  Heap::old_space()->MCCommitRelocationInfo();
  Heap::code_space()->MCCommitRelocationInfo();

#ifdef DEBUG
  if (FLAG_verify_global_gc) VerifyHeapAfterRelocatingObjects();
#endif
}

The live objects in every space are iterated and copied; taking map objects as an example, the relocation code looks like this:

int MarkCompactCollector::RelocateMapObject(HeapObject* obj) {
  // decode map pointer (forwarded address)
  uint32_t encoded = reinterpret_cast<uint32_t>(obj->map());
  Address map_addr = DecodeMapPointer(encoded, Heap::map_space());
  ASSERT(Heap::map_space()->Contains(HeapObject::FromAddress(map_addr)));

  // Get forwarding address before resetting map pointer
  Address new_addr = GetForwardingAddressInOldSpace(obj);

  // recover map pointer
  obj->set_map(reinterpret_cast<Map*>(HeapObject::FromAddress(map_addr)));

  // The meta map object may not be copied yet.
  Address old_addr = obj->address();

  if (new_addr != old_addr) {
    memmove(new_addr, old_addr, Map::kSize);  // copy contents
  }

#ifdef DEBUG
  if (FLAG_gc_verbose) {
    PrintF("relocate %p -> %p\n", old_addr, new_addr);
  }
#endif

  return Map::kSize;
}

Two main things happen here:

1.Obtain the new address (with the same lookup logic described above)
2.Copy the object's contents to the new address with memmove

RebuildRSets

The code is as follows:
void MarkCompactCollector::RebuildRSets() {
#ifdef DEBUG
  ASSERT(state_ == RELOCATE_OBJECTS);
  state_ = REBUILD_RSETS;
#endif
  Heap::RebuildRSets();
}

This resets and rebuilds the remembered sets. A remembered set records, per page, which slots may hold pointers into the new space, so that a later scavenge can find old-to-new pointers without scanning entire pages.
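
As a rough mental model (the exact layout in V8 0.1.5 differs in its details), a remembered set can be pictured as a per-page bitmap with one bit per pointer-sized slot of the object area; a set bit means "this slot may contain a pointer into new space". The sketch below is purely illustrative.

// Illustrative per-page remembered-set bitmap: one bit per pointer-sized
// slot in the page's object area. Structure and names are hypothetical.
struct RSetSketch {
  static const int kSlots = Page::kObjectAreaSize / kPointerSize;
  uint32_t bits[kSlots / 32];

  void Record(Address slot, Address object_area_start) {
    int index = static_cast<int>(slot - object_area_start) / kPointerSize;
    bits[index >> 5] |= (1u << (index & 31));
  }

  bool MayPointToNewSpace(Address slot, Address object_area_start) const {
    int index = static_cast<int>(slot - object_area_start) / kPointerSize;
    return (bits[index >> 5] & (1u << (index & 31))) != 0;
  }
};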

SweepSpaces

SweepSpaces frees dead objects without relocating the survivors the way compaction does; it still works on top of the marking results. Its entry point is:

void MarkCompactCollector::SweepSpaces() {
  ASSERT(state_ == SWEEP_SPACES);
  ASSERT(!IsCompacting());
  // Noncompacting collections simply sweep the spaces to clear the mark
  // bits and free the nonlive blocks (for old and map spaces).  We sweep
  // the map space last because freeing non-live maps overwrites them and
  // the other spaces rely on possibly non-live maps to get the sizes for
  // non-live objects.
  SweepSpace(Heap::old_space(), &DeallocateOldBlock);
  SweepSpace(Heap::code_space(), &DeallocateCodeBlock);
  SweepSpace(Heap::new_space());
  SweepSpace(Heap::map_space(), &DeallocateMapBlock);
}

Each space is swept with SweepSpace; paged spaces and the new space are handled by different overloads of the function.

For a paged space, SweepSpace looks like this:

static void SweepSpace(PagedSpace* space, DeallocateFunction dealloc) {
  PageIterator it(space, PageIterator::PAGES_IN_USE);
  // Iterate over every page in use.
  while (it.has_next()) {
    Page* p = it.next();

    bool is_previous_alive = true;
    Address free_start = NULL;
    HeapObject* object;

    for (Address current = p->ObjectAreaStart();
         current < p->AllocationTop();
         current += object->Size()) {
      object = HeapObject::FromAddress(current);
      if (is_marked(object)) {
        clear_mark(object);
        if (MarkCompactCollector::IsCompacting() && object->IsCode()) {
          // If this is compacting collection marked code objects have had
          // their IC targets converted to objects.
          // They need to be converted back to addresses.
          Code::cast(object)->ConvertICTargetsFromObjectToAddress();
        }
        if (!is_previous_alive) {  // Transition from free to live.
          dealloc(free_start, current - free_start);
          is_previous_alive = true;
        }
      } else {
        if (object->IsCode()) {
          LOG(CodeDeleteEvent(Code::cast(object)->address()));
        }
        if (is_previous_alive) {  // Transition from live to free.
          free_start = current;
          is_previous_alive = false;
        }
      }
      // The object is now unmarked for the call to Size() at the top of the
      // loop.
    }

    // If the last region was not live we need to deallocate from free_start
    // to the allocation top in the page.
    if (!is_previous_alive) {
      int free_size = p->AllocationTop() - free_start;
      if (free_size > 0) {
        dealloc(free_start, free_size);
      }
    }
  }
}

Every object in every page is visited; an unmarked object is dead, and the contiguous dead region around it is released through the DeallocateFunction passed in. For old_space that function is DeallocateOldBlock:

void MarkCompactCollector::DeallocateOldBlock(Address start,
                                              int size_in_bytes) {
  Heap::ClearRSetRange(start, size_in_bytes);
  Heap::old_space()->Free(start, size_in_bytes);
}

In other words, the remembered-set entries for the range are cleared and the memory is added to the space's free list.
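
Conceptually the freed range simply becomes a node on the space's free list, from which later allocations can be satisfied. The sketch below shows that idea in its simplest form; it is not the actual old-space free-list implementation.

// Intentionally simplified free-list sketch (illustrative only).
struct FreeNodeSketch {
  int size_in_bytes;
  FreeNodeSketch* next;
};

struct FreeListSketch {
  FreeListSketch() : head(NULL) {}
  FreeNodeSketch* head;

  // What Free(start, size) conceptually does: thread the dead region
  // onto the list so it can be reused.
  void Free(Address start, int size_in_bytes) {
    FreeNodeSketch* node = reinterpret_cast<FreeNodeSketch*>(start);
    node->size_in_bytes = size_in_bytes;
    node->next = head;
    head = node;
  }

  // A later allocation takes the first node that is large enough
  // (first fit; the real code is more careful about splitting and waste).
  Address Allocate(int size_in_bytes) {
    for (FreeNodeSketch** p = &head; *p != NULL; p = &(*p)->next) {
      if ((*p)->size_in_bytes >= size_in_bytes) {
        Address result = reinterpret_cast<Address>(*p);
        *p = (*p)->next;
        return result;
      }
    }
    return NULL;  // the caller must grow the space
  }
};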

The new-space SweepSpace looks like this:

static void SweepSpace(NewSpace* space) {
  HeapObject* object;
  for (Address current = space->bottom();
       current < space->top();
       current += object->Size()) {
    object = HeapObject::FromAddress(current);
    if (is_marked(object)) {
      clear_mark(object);
    } else {
      // We give non-live objects a map that will correctly give their size,
      // since their existing map might not be live after the collection.
      // Give the dead object a size-correct map, because its original map
      // may be freed when the map space is swept below.
      int size = object->Size();
      if (size >= Array::kHeaderSize) {
        object->set_map(Heap::byte_array_map());
        ByteArray::cast(object)->set_length(ByteArray::LengthFor(size));
      } else {
        ASSERT(size == kPointerSize);
        object->set_map(Heap::one_word_filler_map());
      }
      ASSERT(object->Size() == size);
    }
    // The object is now unmarked for the call to Size() at the top of the
    // loop.
  }
}

Dead objects simply have their map pointer replaced with a filler map, because their original map may be freed when the map space is swept afterwards.
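
The reason even dead objects need a size-correct map is that heap iteration is purely linear: the address of the next object is computed from the current object's Size(), which in turn comes from its map. A minimal illustrative sketch of such a walk:

// Illustrative sketch: a linear walk over a contiguous region advances by
// object->Size(), so every object it passes, live or dead, must have a map
// that yields the correct size (hence the byte-array / one-word fillers).
void LinearWalkSketch(Address bottom, Address top) {
  Address current = bottom;
  while (current < top) {
    HeapObject* object = HeapObject::FromAddress(current);
    int size = object->Size();  // relies on a valid map
    // ... visit the object here ...
    current += size;
  }
}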

Finish

Finish clears the StubCache. The code is as follows:

void MarkCompactCollector::Finish() {
#ifdef DEBUG
  ASSERT(state_ == SWEEP_SPACES || state_ == REBUILD_RSETS);
  state_ = IDLE;
#endif
  // The stub cache is not traversed during GC; clear the cache to
  // force lazy re-initialization of it. This must be done after the
  // GC, because it relies on the new address of certain old space
  // objects (empty string, illegal builtin).
  StubCache::Clear();
}

A stub usually contains optimized code that handles the specific kinds of operations an IC (inline cache) has encountered before. When a stub runs into an operation its optimized code cannot handle, it calls into the C++ runtime; after the runtime has handled the operation, it generates a new stub that can deal with that operation as well (in addition to the cases it already covered).

Shrink

Shrink shrinks the paged spaces, operating on map_space_, old_space_ and code_space_ in turn:

void Heap::Shrink() {
  // Try to shrink map, old, and code spaces.
  map_space_->Shrink();
  old_space_->Shrink();
  code_space_->Shrink();
}

Each of these ends up calling PagedSpace::Shrink:

void PagedSpace::Shrink() {
  // Release half of free pages.
  // i.e. keep every other free page after the allocation top.
  Page* top_page = AllocationTopPage();
  ASSERT(top_page->is_valid());

  // Loop over the pages from the top page to the end of the space to count
  // the number of pages to keep and find the last page to keep.
  int free_pages = 0;
  int pages_to_keep = 0;  // Of the free pages.
  Page* last_page_to_keep = top_page;
  Page* current_page = top_page->next_page();
  // Loop over the pages to the end of the space.
  while (current_page->is_valid()) {
    // Keep every odd-numbered page, one page for every two in the space.
    if ((free_pages & 0x1) == 1) {
      pages_to_keep++;
      last_page_to_keep = last_page_to_keep->next_page();
    }
    free_pages++;
    current_page = current_page->next_page();
  }

  // Free pages after last_page_to_keep, and adjust the next_page link.
  Page* p = MemoryAllocator::FreePages(last_page_to_keep->next_page());
  MemoryAllocator::SetNextPage(last_page_to_keep, p);

  // Since pages are only freed in whole chunks, we may have kept more than
  // pages_to_keep.
  while (p->is_valid()) {
    pages_to_keep++;
    p = p->next_page();
  }

  // The difference between free_pages and pages_to_keep is the number of
  // pages actually freed.
  ASSERT(pages_to_keep <= free_pages);
  int bytes_freed = (free_pages - pages_to_keep) * Page::kObjectAreaSize;
  accounting_stats_.ShrinkSpace(bytes_freed);

  ASSERT(Capacity() == CountTotalPages() * Page::kObjectAreaSize);
}

In effect this releases roughly half of the unused pages after the allocation top of the paged space (every other free page is kept).

Summary

This article walked through V8's memory management from the source code's perspective. It may seem to have little bearing on day-to-day work, but reading the code is still a lot of fun and broadens one's knowledge.
