RiS3-Lab / ModelXRay

On-device machine learning model analyzer and extractor for Android apps. Check out our USENIX Security '21 paper "Mind Your Weight(s): A Large-scale Study on Insufficient Machine Learning Model Protection in Mobile Apps".

Frida scripts may not work as expected #1

Closed MGYN closed 3 years ago

MGYN commented 3 years ago

Hi, I am testing the Frida scripts generated by the "-f" command to dump an encrypted model, but I found some issues that can make the dumped model invalid. I tested several apps and got no correct dumped model, so I have some questions about the scripts that I hope you can answer, thanks!

Validating the model

In the scripts, you dump all the buffers passed to free() and to the customized functions, but there may be hundreds of free() calls, so hundreds of suspect buffers get dumped. How do you know which one is the correctly decrypted model? In your paper, you said you test it with Protobuf or Netron and save the model as a ".pb" file, so is there any automatic script to validate the model file? If yes, I hope you can open-source it.

Dump size

In your paper, you said you tried to dump memory whose size is smaller than or equal to the encrypted model size, but in your scripts you set the dump size to 10*1024, and in hookfunc() the size is only 100 if it recognizes the starting byte 0x0A. Maybe it should be modified as follows? dumpFreeBuffer(args[0], bfsize, args[0], bfsize);

Format

In Table 6 of your paper, you report an unknown format extracted by strategy 0. Since the format is unknown, how can you judge and validate it?

Looking forward to your answers, thanks!

MGYN commented 3 years ago

The command I use is "-j".

sunzc commented 3 years ago

Thanks for your interest :)

Validate the model

Did you try the pull_and_analyze.sh script? Its output is a file called pb.result, which helps you guess whether a buffer contains a model. For each suspected buffer, the tool automatically generates a hexdumped version, which helps locate the start address of the model. Unfortunately, locating the actual start address is a manual effort that requires knowledge of the protobuf encoding format. Once you know the start address, extractpb.sh will extract the protobuf part from the buffer. The output can be further verified with a protobuf decoder; we use https://protogen.marcgravell.com/decode, as it can decode incomplete buffers (we only dump the first 10KB of each suspected buffer).
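For illustration, here is a minimal sketch (not one of the released scripts) of how a dumped buffer could be scanned for plausible protobuf message starts; the file handling and heuristic are assumptions:

```js
// Hypothetical helper (not part of ModelXRay): scan a dumped buffer for
// offsets that look like the start of a protobuf message. A protobuf key
// byte is (field_number << 3) | wire_type; 0x0A = field 1, wire type 2
// (length-delimited), which is how many serialized models begin.
const fs = require('fs');

function candidateOffsets(buf, maxHits = 10) {
  const hits = [];
  for (let i = 0; i < buf.length - 1 && hits.length < maxHits; i++) {
    if (buf[i] === 0x0a) {
      // The bytes after the key should form a varint length; a plausible
      // length terminates its continuation bits within a few bytes.
      let j = i + 1, cont = 0;
      while (j < buf.length && (buf[j] & 0x80) && cont < 4) { j++; cont++; }
      if (j < buf.length) hits.push(i);
    }
  }
  return hits;
}

// Usage: node pbscan.js <dumped_buffer_file>
const buf = fs.readFileSync(process.argv[2]);
console.log('candidate protobuf offsets:', candidateOffsets(buf));
```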

Dump size

We dump the first 10KB of each big suspected buffer (size > 500KB or 1MB). A buffer starting with 0x0A (a protobuf encoding signature) has a possibility of containing a model located right at the beginning of the buffer (usually not the case), so we dump the first few bytes for quick verification.

A little bit of background on how the instrumentation works:

When you use modelxray with -j, it generates several versions of the instrumentation scripts (intercept_all, intercept_fw (framework), intercept_magic). All scripts share a common part that instruments malloc/free; they differ in which other functions they instrument, which is for experimental purposes, so we can skip it here. Let's focus on malloc/free. malloc and free are called many times, and we only instrument the calls made from within the ML SDK libraries. We instrument malloc to record the allocated buffer size and dump a buffer only when it is big enough to contain a model (buffer pointers are stored in bml, the "big malloc list" in the JavaScript code). We tried different thresholds, 500KB and 1MB; you can customize it depending on the app you are analyzing, since some apps use big models and some use small ones.
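A simplified Frida sketch of this malloc/free idea (illustrative only; the generated intercept_* scripts differ in detail, and libml.so is a placeholder for the target app's ML SDK library):

```js
// Sketch of the interception idea, not the exact generated script.
const ML_LIB = 'libml.so';        // assumption: name of the ML SDK library
const THRESHOLD = 500 * 1024;     // minimum model size, configurable
const bml = {};                   // "big malloc list": pointer -> size

// Only keep allocations whose call site is inside the ML library.
function calledFromMlLib(retAddr) {
  const m = Process.findModuleByAddress(retAddr);
  return m !== null && m.name === ML_LIB;
}

Interceptor.attach(Module.getExportByName(null, 'malloc'), {
  onEnter(args) {
    this.size = args[0].toInt32();
    this.hit = calledFromMlLib(this.returnAddress);
  },
  onLeave(retval) {
    // Track only big allocations requested from the ML library.
    if (this.hit && this.size >= THRESHOLD) bml[retval.toString()] = this.size;
  }
});

Interceptor.attach(Module.getExportByName(null, 'free'), {
  onEnter(args) {
    const key = args[0].toString();
    if (key in bml) {
      // Dump just the first 10KB of the buffer for quick verification.
      send({ ptr: key, size: bml[key] },
           args[0].readByteArray(Math.min(bml[key], 10 * 1024)));
      delete bml[key];
    }
  }
});
```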

"In your paper, you said you tried to dump the memory which size is small or equal to the encrypted model size" A bit misunderstanding here. I just quoted part of section 5.1 from the paper:

"Naive instrumentation of deallocation APIs can lead to dramatic app slowdown. We optimize it by first only activating it after the ML library is loaded, and second, only for buffers greater than the minimum model size (a configurable threshold). To get buffer size, memory allocation APIs (e.g.,malloc) are instrumented as well. The size information also helps correlate a decrypted model to its encrypted version (discussed in §5.3)"

So we use the buffer size information to match the suspected model buffer with the actual encrypted model file.
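As an illustration of that matching step, a hypothetical post-processing snippet could compare a dumped buffer's recorded size against the sizes of the app's (encrypted) asset files; the directory layout and names here are assumptions, not part of the tool:

```js
// Hypothetical: find asset files whose size equals the dumped buffer size,
// which is the correlation hint described in Section 5.3 of the paper.
const fs = require('fs');
const path = require('path');

function matchBySize(dumpSize, assetDir) {
  return fs.readdirSync(assetDir)
    .map(f => path.join(assetDir, f))
    .filter(f => fs.statSync(f).size === dumpSize);
}

// Example: './extracted_assets' is an illustrative path to the app's assets.
console.log(matchBySize(1234567, './extracted_assets'));
```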

Format

For some model buffers, we are not sure about the actual encoding format, but based on string information, such as model layer names and other metadata (conv, relu), and on the buffer size (exactly the same as the model file), we are pretty sure it is the decrypted model buffer.
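A hypothetical version of that strings check, assuming a dump file on disk (not one of the released scripts):

```js
// Weak-evidence heuristic: look for layer-name strings inside a dumped
// buffer as a hint that it is a decrypted model. Marker list is illustrative.
const fs = require('fs');

function modelMarkers(buf, markers = ['conv', 'relu', 'softmax']) {
  const text = buf.toString('latin1').toLowerCase();
  return markers.filter(m => text.includes(m));
}

// Usage: node strcheck.js <dumped_buffer_file>
const buf = fs.readFileSync(process.argv[2]);
console.log('markers found:', modelMarkers(buf));
```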

Disclaimer

We do not intend to provide a powerful tool that can extract complete encrypted models, and we do not encourage misuse of this tool, as that would jeopardize the model vendors. The goal of this project is to raise awareness of the model privacy problem.

MGYN commented 3 years ago

So, can I understand it this way: the purpose of the dump script is to show the feasibility of obtaining the decrypted models, but extra effort is still needed to get the complete models? And since getting complete models and recognizing their format is not the main work of this project, finding strings like "Conv" or "TFL3" in the top bytes is enough to show feasibility. Those strings may only apply to protobuf serialization; for models in other serialization formats, a buffer size identical to the model file can also amply demonstrate it.

To summarize: obtaining complete models is not implemented in this project, and the cases discussed in Table 6 likewise show feasibility rather than the result of obtaining complete models.

sunzc commented 3 years ago

Yes, it takes extra effort to get the complete models.

For the cases discussed in Table 6, if it is a known format (e.g. protobuf), we took an extra step during our responsible disclosure: we located the model and provided the model vendor with the decrypted model header (the first 10KB), which can be parsed and verified with a protobuf decoder to show the feasibility.

Aaron911 commented 3 years ago

Hello, I have the same question. I am testing the scripts generated by the "-j" command to dump the model, and I have tried the pull_and_analyze.sh script. I want to know which file is pulled from the phone? I did not find this file. Is it pulled while the app is running?

sunzc commented 3 years ago

When you test, you should get the app running and navigate it to the AI-related functions, which will trigger the model-loading code. If you are lucky, you will see in the log info that a suspected buffer has been dumped, which means the suspected buffer is stored on the phone. Then you can run pull_and_analyze.sh to pull the suspected model buffer files onto your host for further analysis.

More about how the extractor works: we instrument the app's ML library and intercept its malloc/free calls, so you need to use the app and trigger the ML-related code before our instrumentation takes effect.

Hope it helps.