gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support
https://grapheneproject.io
GNU Lesser General Public License v3.0

[fseek] No such file or directory when loading large input #2478

Closed StanPlatinum closed 3 years ago

StanPlatinum commented 3 years ago

Description of the problem

I am running bwa (https://github.com/lh3/bwa), a bioinformatics application, on Graphene.

When I feed a large dataset (hg38.fa, about 3.1GB) to the bwa mem command, an [fseek] error appears at the very beginning. When I feed it a smaller dataset (mref.fa, about 1.1GB), it works fine.

bwa can consume a very large amount of memory, so I set sgx.enclave_size = "32G". This is the largest value I can use, since the machine has only 64G of main memory; setting it to "64G" fails because there is not enough memory.

Steps to reproduce

Commit ID: 6ba10c6

I know that reproducing the issue might be hard, but FYI, I am posting my steps below. I wrote a template for Graphene to run applications more conveniently.

git clone https://github.com/StanPlatinum/graphene-bwa

Then set the Graphene directory at https://github.com/StanPlatinum/graphene-bwa/blob/main/Makefile#L15 and the application directory at https://github.com/StanPlatinum/graphene-bwa/blob/main/Makefile#L12.

cd graphene-bwa
make SGX=1 run

The manifest can be found at https://github.com/StanPlatinum/graphene-bwa/blob/main/bwa.manifest.template

You will need a machine with at least 32G of main memory, and you will need to download the human genome datasets.

I don't expect you to spend much time reproducing it. What I can see is that the error appears right after the (long) enclave initialization, so it seems to happen while the input data is being loaded.

I also heard that someone else (@ya0guang) encountered a similar issue. I know it might be hard to fix, so I wonder whether there is a workaround for loading a very large input file?

Expected results

I think this must be a "huge data" issue, since running with the smaller dataset gives a correct result.

Actual results

graphene-sgx bwa mem data/hg38_reference.fa data/SRR062634.filt.fastq
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
[fseek] No such file or directory
Makefile:70: recipe for target 'run' failed
make: *** [run] Error 1

Thanks!

pwmarcz commented 3 years ago

@StanPlatinum

Thank you for the report! It seems that we have a bug in handling large files (> 2 GB).

Could you check if PR #2485 fixes your problem? It's a one-line fix:

--- a/LibOS/shim/src/sys/shim_open.c
+++ b/LibOS/shim/src/sys/shim_open.c
@@ -218,7 +218,7 @@ long shim_do_lseek(int fd, off_t offset, int origin) {
     if (!hdl)
         return -EBADF;

-    int ret = 0;
+    off_t ret = 0;
     if (hdl->is_dir) {
         ret = do_lseek_dir(hdl, offset, origin);
         goto out;

StanPlatinum commented 3 years ago

@pwmarcz Thanks!

Yes, the PR fixes it! Feel free to close the issue.