Closed by rajveerb 1 year ago
Symlinked files get cached in memory, which makes end-to-end (E2E) VTune profiling inaccurate, because the paper's synthetic dataset consists entirely of symlinks.
If the goal is to profile only preprocessing, the symlink option works well: the CPU time spent on I/O to fetch data from storage into main memory is not counted in the profile.
I used vmtouch to check whether a file is cached in memory.
Test: given a file on a remote filesystem, check whether its contents are cached after accessing it once.
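For anyone who wants to script the same check without shelling out to vmtouch, here is a rough sketch that queries page-cache residency through `mincore(2)`, the system call vmtouch itself relies on. It is not the tooling used in this issue: `resident_fraction` is a hypothetical helper name, and the code assumes 64-bit Linux with glibc reachable through `ctypes`.

```python
import ctypes
import mmap
import os


def resident_fraction(path):
    """Return the fraction of the file's pages resident in the page cache.

    Hypothetical helper; assumes 64-bit Linux. Symlinks are followed, so
    the result describes the symlink's target.
    """
    size = os.path.getsize(path)
    if size == 0:
        return 0.0  # nothing to map, so nothing can be resident

    libc = ctypes.CDLL(None, use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                             ctypes.POINTER(ctypes.c_ubyte)]

    # mincore fills one status byte per page of the mapping.
    n_pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    vec = (ctypes.c_ubyte * n_pages)()

    fd = os.open(path, os.O_RDONLY)
    try:
        # Mapping the file does not read it, so the cache state is unchanged.
        addr = libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
        if addr is None or addr == ctypes.c_void_p(-1).value:
            raise OSError(ctypes.get_errno(), "mmap failed")
        try:
            if libc.mincore(addr, size, vec) != 0:
                raise OSError(ctypes.get_errno(), "mincore failed")
        finally:
            libc.munmap(addr, size)
    finally:
        os.close(fd)

    # The low bit of each status byte is set when that page is resident.
    return sum(b & 1 for b in vec) / n_pages
```

To run the test described above, call `resident_fraction(path)` once, read the file in full, then call it again and compare the two values. On a remote filesystem this only reflects the client-side page cache. Newer kernels may also report conservative results for files the caller neither owns nor can write, so the numbers are best cross-checked against vmtouch.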
This needs to be verified on a C4130 node in CloudLab using a long-term dataset.