Hi @zijwang, my guess is that you can take the RPv2 data loader script here and change its `_URL_BASE` variable to point at the base directory of the data on your filesystem. You should then be able to pass the path of your modified loading script to `datasets.load_dataset` (here is an explanation about this).
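To make that concrete, here is a rough, untested sketch of the idea. It assumes the loader script defines `_URL_BASE` as a single-line, top-level assignment, and that the script accepts a local directory in place of the remote URL. The helper names `point_script_at_local_data` and `load_local` are made up for illustration and are not part of `datasets` or the RPv2 repo.

```python
import re
from pathlib import Path


def point_script_at_local_data(script_src: Path, script_dst: Path, local_base: str) -> None:
    """Copy a loader script, rewriting its top-level _URL_BASE assignment.

    Assumption: _URL_BASE is assigned on a single line at module level.
    """
    text = script_src.read_text(encoding="utf-8")
    patched, count = re.subn(
        r"^_URL_BASE\s*=.*$",
        # A function replacement avoids backslash escapes in Windows paths
        # being interpreted by the regex engine.
        lambda _match: f"_URL_BASE = {local_base!r}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no top-level _URL_BASE assignment found")
    script_dst.write_text(patched, encoding="utf-8")


def load_local(script_path: Path, **kwargs):
    """Pass the patched script to datasets.load_dataset.

    Any keyword arguments (config name etc.) are forwarded unchanged.
    """
    # Imported lazily so the patching helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(str(script_path), **kwargs)
```

You would call `point_script_at_local_data` once on your copy of the loader script, then call `load_local` with the same arguments you would otherwise give to `load_dataset` for the hosted dataset. Depending on your installed `datasets` version, script-based loading may additionally need `trust_remote_code=True`, or may not be supported at all, so please check that first.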
I downloaded the data following the instructions here. Is there a recommended way to load it via the HF API, similar to this?