zakariamaaraki / RemoteCodeCompiler

An online code compiler supporting 11 programming languages (Java, Kotlin, Scala, C, C++, C#, Golang, Python, Ruby, Rust and Haskell) for competitive programming and coding interviews.
GNU General Public License v3.0
157 stars 52 forks source link
coding-interviews competitive-programming docker grpc java kafka kafka-streams rabbitmq serverless spring-boot stream-processing

Build and Test Environment Docker Images CI Docker Test Coverage License: GPL v3

Remote Code Compiler

UI

An online code compiler supporting 11 languages (Java, Kotlin, C, C++, C#, Golang, Python, Scala, Ruby, Rust and Haskell) for competitive programming and coding interviews. This tool execute your code remotely using docker containers to separate environments of execution.

Supported languages

Supports Rest Calls (Long Polling and Push Notification), Apache Kafka and Rabbit MQ Messages, and gRPC.

Table of Contents

Security Considerations

The compiler ensures the security of user code execution by sandboxing the execution environment and applying strict resource limits. Additionally, input sanitization and validation are performed to prevent code injection attacks.

Sandboxing

The Remote Code Compiler employs sandboxing techniques to isolate user code executions from the underlying system. Each code execution occurs within a dedicated Docker container, providing a secure and contained environment. This isolation prevents unauthorized access to system resources and protects against potential security vulnerabilities.

Resource Limits

Strict resource limits are enforced to prevent resource exhaustion attacks and ensure fair resource allocation. The compiler sets limits on CPU usage, memory consumption, and execution time for each code execution. These limits mitigate the risk of denial-of-service (DoS) attacks and ensure the stability and reliability of the compiler platform.

Input Sanitization

Input sanitization measures are implemented to validate and sanitize user inputs before execution. This helps prevent code injection attacks and ensures that only safe and trusted inputs are processed by the compiler. By sanitizing inputs, the compiler reduces the risk of executing malicious code and maintains the integrity of the execution environment.

Rate Limiting

To ensure fair usage and protect the system from abuse, a rate limit of 5 requests per second (RPS) per user ID is enforced. This rate limit can be overridden by setting the MAX_USER_REQUESTS environment variable to a different value. This flexibility allows you to adjust the rate limits based on specific needs or usage patterns.

Garbage Collection

A garbage collector runs periodically to clean up any stuck executions and their associated resources, ensuring efficient resource utilization and preventing resource leaks.

Scalability

The compiler can scale horizontally to handle increased load by deploying multiple instances behind a load balancer. Each instance is stateless and can independently process incoming requests, ensuring high availability and performance.

How It Works

When a request arrives, the compiler gets to work by creating a special container just for compiling the code you sent. This container works closely with the main application, sharing its storage space for easy access to files. Once the code is compiled successfully, the compiler sets up separate containers for running each test. These containers work independently, each having its own space to run the code without being affected by others.

Architecture

In the execution step, each container is assigned a set number of CPUs (consistent across all containers, with a recommended value of 0.1 CPUs per execution), as well as limits on memory and execution time. When the container hits either the memory threshold or the maximum time allowed, it is automatically terminated, and a user-facing error message is generated to explain the termination cause.

Benchmark Report

Overview

In our endeavor to develop a robust and efficient remote code compiler, we conducted a series of benchmark tests to evaluate the performance across multiple programming languages. The problem chosen for this benchmark is a simple problem from Codeforces: Watermelon (Problem A from Contest 4).

We executed 4000 test runs for each of 10 test cases in four different programming languages: Java, Python, C, and C++. This resulted in a total of 40000 executions and 3000 compilations.

Test Environment

The tests were conducted on a virtual machine configured with:

This setup provided a controlled environment to ensure consistency and reliability in our benchmark results.

Compilation Performance

Compilation Time Analysis

We measured the time taken to compile code for each of the four languages. Here are the key metrics:

The average compilation duration provides a reliable estimate of how long it typically takes to compile code, while the maximum and 95th percentile durations indicate the upper bounds under the test conditions.

Remark: The compilation time includes the time taken for the creation of the container.

Compilation Time Distribution

Language Max Compilation Time (s) Avg Compilation Time (s) 95th Percentile Compilation Time (s)
Java 0.950406434 0.82969625775 0.912345678
Python N/A N/A N/A
C 0.812345678 0.78912345678 0.805678912
C++ 0.834567890 0.81234567890 0.825678901

Note: Python is an interpreted language and does not require a separate compilation step, hence the N/A values.

Execution Performance

Execution Time Analysis

We measured the execution time for each test case. The execution duration reflects the time taken to execute one test case. The aggregate execution duration for all test cases was also computed to understand the overall performance.

Given that the problem have 10 test cases, the total execution duration for all test cases is:

Remark: The execution time includes the time taken for the creation of the container and running it.

Execution Time Distribution

Language Max Execution Time (s) Avg Execution Time (s) 95th Percentile Execution Time (s)
Java 0.601810192 0.52991365113 0.589123456
Python 0.701234567 0.62345678912 0.689123456
C 0.498765432 0.45678901234 0.478901234
C++ 0.512345678 0.46789012345 0.489012345

Observations and Insights

image

image

Prerequisites

To run this project you need a docker engine running on your machine.

Getting Started

1- Build a docker image:

docker image build . -t compiler

2- Create a volume:

docker volume create compiler

3- build the necessary docker images used by the compiler

./environment/build.sh

4- Run the container:

docker container run -p 8080:8082 -v /var/run/docker.sock:/var/run/docker.sock -v compiler:/compiler -e DELETE_DOCKER_IMAGE=true -e EXECUTION_MEMORY_MAX=10000 -e EXECUTION_MEMORY_MIN=0 -e EXECUTION_TIME_MAX=15 -e EXECUTION_TIME_MIN=0 -e MAX_REQUESTS=1000 -e MAX_EXECUTION_CPUS=0.2 -e COMPILATION_CONTAINER_VOLUME=compiler -t compiler

Local Run (for dev environment only)

See the documentation in the local folder, a docker-compose is provided.

docker-compose up --build

On K8s

k8s logo

You can use the provided helm chart to deploy the project on k8s, see the documentation in the k8s folder.

helm install compiler ./k8s/compiler

AKS Provisioning

We provide you with a script to provision an AKS cluster to ease your deployment experience. See the documentation in the provisioning folder.

APIs

For the Rest API documentation visit the swagger page at the following url : http:///swagger-ui.html

Compiler swagger doc

Example of a Rest call

curl  'http://localhost:8080/api/compile/json'  -X POST  -H 'Content-Type: application/json; charset=UTF-8'  --data-raw '{"sourcecode":"// Java code here\npublic class main {\n    public static void main(String[] args) {\n        System.out.println(\"NO\");\n    }\n}","language":"JAVA", "testCases": {"test1" : {"input" : "", "expectedOutput" : "NO"}}, "memoryLimit" : 500, "timeLimit": 2 }'

NOTE: The time limit in the request should be in seconds (s) and the memory limit in megabytes (MB).

Verdicts

Here is a list of Verdicts that can be returned by the compiler:

NOTE: The time limit is in milliseconds (ms) and the memory limit is in megabytes (MB).

:tada: Accepted

{
    "verdict": "Accepted",
    "statusCode": 100,
    "error": "",
    "testCasesResult": {
      "test1": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "0 1 2 3 4 5 6 7 8 9",
        "error": "", 
        "expectedOutput": "0 1 2 3 4 5 6 7 8 9",
        "executionDuration": 175
      },
      "test2": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "9 8 7 1",
        "error": "" ,
        "expectedOutput": "9 8 7 1",
        "executionDuration": 273
      },
      ...
    },
    "compilationDuration": 328,
    "averageExecutionDuration": 183,
    "timeLimit": 1500,
    "memoryLimit": 500,
    "language": "JAVA",
    "dateTime": "2022-01-28T23:32:02.843465"
}

:x: Wrong Answer

{
    "verdict": "Wrong Answer",
    "statusCode": 200,
    "error": "",
    "testCasesResult": {
      "test1": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "0 1 2 3 4 5 6 7 8 9",
        "error": "", 
        "expectedOutput": "0 1 2 3 4 5 6 7 8 9",
        "executionDuration": 175
      },
      "test2": {
        "verdict": "Wrong Answer",
        "verdictStatusCode": 200,
        "output": "9 8 7 1",
        "error": "" ,
        "expectedOutput": "9 8 6 1",
        "executionDuration": 273
      }
    },
    "compilationDuration": 328,
    "averageExecutionDuration": 183,
    "timeLimit": 1500,
    "memoryLimit": 500,
    "language": "JAVA",
    "dateTime": "2022-01-28T23:32:02.843465"
}

:shit: Compilation Error

{
  "verdict": "Compilation Error",
   "statusCode": 300,
   "error": "# command-line-arguments\n./main.go:5:10: undefined: i\n./main.go:6:21: undefined: i\n./main.go:7:9: undefined: i\n",
   "testCasesResult": {},
   "compilationDuration": 118,
   "averageExecutionDuration": 0,
   "timeLimit": 1500,
   "memoryLimit": 500,
   "language": "GO",
   "dateTime": "2022-01-28T23:32:02.843465"
}

:clock130: Time Limit Exceeded

{
    "verdict": "Time Limit Exceeded",
    "statusCode": 500,
    "error": "Execution exceeded 15sec",
    "testCasesResult": {
      "test1": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "0 1 2 3 4 5 6 7 8 9",
        "error": "", 
        "expectedOutput": "0 1 2 3 4 5 6 7 8 9",
        "executionDuration": 175
      },
      "test2": {
        "verdict": "Time Limit Exceeded",
        "verdictStatusCode": 500,
        "output": "",
        "error": "Execution exceeded 15sec" ,
        "expectedOutput": "9 8 7 1",
        "executionDuration": 1501
      }
    },
    "compilationDuration": 328,
    "averageExecutionDuration": 838,
    "timeLimit": 1500,
    "memoryLimit": 500,
    "language": "JAVA",
    "dateTime": "2022-01-28T23:32:02.843465"
}

:boom: Runtime Error

{
    "verdict": "Runtime Error",
    "statusCode": 600,
    "error": "panic: runtime error: integer divide by zero\n\ngoroutine 1 [running]:\nmain.main()\n\t/app/main.go:11 +0x9b\n",
    "testCasesResult": {
      "test1": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "0 1 2 3 4 5 6 7 8 9",
        "error": "", 
        "expectedOutput": "0 1 2 3 4 5 6 7 8 9",
        "executionDuration": 175
      },
      "test2": {
        "verdict": "Runtime Error",
        "verdictStatusCode": 600,
        "output": "",
        "error": "panic: runtime error: integer divide by zero\n\ngoroutine 1 [running]:\nmain.main()\n\t/app/main.go:11 +0x9b\n" ,
        "expectedOutput": "9 8 7 1",
        "executionDuration": 0
      }
    },
    "compilationDuration": 328,
    "averageExecutionDuration": 175,
    "timeLimit": 1500,
    "memoryLimit": 500,
    "language": "GO",
    "dateTime": "2022-01-28T23:32:02.843465"
}

:minidisc: Out Of Memory

{
    "verdict": "Out Of Memory",
    "statusCode": 400,
    "error": "fatal error: runtime: out of memory\n\nruntime stack:\nruntime.throw({0x497d72?, 0x17487800000?})\n\t/usr/local/go/src/runtime/panic.go:992 +0x71\nruntime.sysMap(0xc000400000, 0x7ffccb36b0d0?, 0x7ffccb36b13...",
    "testCasesResult": {
      "test1": {
        "verdict": "Accepted",
        "verdictStatusCode": 100,
        "output": "0 1 2 3 4 5 6 7 8 9",
        "error": "", 
        "expectedOutput": "0 1 2 3 4 5 6 7 8 9",
        "executionDuration": 175
      },
      "test2": {
        "verdict": "Out Of Memory",
        "verdictStatusCode": 400,
        "output": "",
        "error": "fatal error: runtime: out of memory\n\nruntime stack:\nruntime.throw({0x497d72?, 0x17487800000?})\n\t/usr/local/go/src/runtime/panic.go:992 +0x71\nruntime.sysMap(0xc000400000, 0x7ffccb36b0d0?, 0x7ffccb36b13..." ,
        "expectedOutput": "9 8 7 1",
        "executionDuration": 0
      }
    },
    "compilationDuration": 328,
    "averageExecutionDuration": 175,
    "timeLimit": 1500,
    "memoryLimit": 500,
    "language": "GO",
    "dateTime": "2022-01-28T23:32:02.843465"
}

How to use it from the UI

The compiler is equipped with some problems specified in the problems.json file located in the resource folder. These problem sets are automatically loaded upon project startup, granting you the opportunity to explore and test them through the /problems endpoint.

ui-problem-page.png

Push Notifications

You may want to get the response later and to avoid http timeouts, you can use push notifications, to do so you should pass two header values (url where you want to get the response and set preferPush to prefer-push)

push-notifications.png

To enable push notifications you should set the environment variable ENABLE_PUSH_NOTIFICATION to true

Multipart request

You have also the possibility to use multipart requests, you typically can use these requests for file uploads and for transferring data of several types in a single request. The only limitation with that, is that you can specify only one test case.

multipart-request.png

With gRPC

You should first set the gRPC server port number of the Remote Code Compiler using the GRPC_PORT environment variable. This ensures that the server listens on the correct port.

To interact with the Remote Code Compiler using gRPC, follow these steps:

Step 1: Install gRPC and Protocol Buffers

Ensure you have gRPC and Protocol Buffers installed in your development environment. Instructions can be found on the official gRPC website.

Step 2: Generate gRPC Client Code

Generate the gRPC client code from the provided compiler.proto file. This can be done using the protoc compiler.

protoc --java_out=src/main/java --grpc-java_out=src/main/java -I=src/main/proto src/main/proto/compiler.proto
Step 3: Implement the gRPC Client

Here is an example of how to implement a gRPC client in Java to consume the CompilerService:

import com.cp.compiler.contract.CompilerServiceGrpc;
import com.cp.compiler.contract.RemoteCodeCompilerRequest;
import com.cp.compiler.contract.RemoteCodeCompilerResponse;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class CompilerClient {
    public static void main(String[] args) {
        // Create a channel to connect to the server
        ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 50051)
                .usePlaintext()
                .build();

        // Create a blocking stub to interact with the service
        CompilerServiceGrpc.CompilerServiceBlockingStub stub = CompilerServiceGrpc.newBlockingStub(channel);

        // Create a request
        RemoteCodeCompilerRequest request = RemoteCodeCompilerRequest.newBuilder()
                .setSourcecode("public class Main { public static void main(String[] args) { System.out.println(\"Hello, World!\"); } }")
                .setLanguage(RemoteCodeCompilerRequest.Language.JAVA)
                .setExpectedOutput("Hello, World!")
                .setTimeLimit(5)
                .setMemoryLimit(512)
                .build();

        // Call the service and get the response
        RemoteCodeCompilerResponse response = stub.compile(request);

        // Handle the response
        System.out.println("Verdict: " + response.getExecution().getVerdict());
        System.out.println("Status Code: " + response.getExecution().getStatusCode());

        // Close the channel
        channel.shutdown();
    }
}

Visualize Docker images and containers information

It is also possible to visualize information about the images and docker containers that are currently running using these endpoints

Docker info

Stream processing with Kafka Streams

The Remote Code Compiler integrates with Apache Kafka for stream processing. This allows for efficient handling of high-throughput data streams and real-time analytics.

remote code compiler kafka mode

Kafka Streams Toplogy

image

To enable kafka mode you should pass to the container the following env variables :

NOTE: Having More partitions => More Parallelism => Better performance

docker container run -p 8080:8082 -v /var/run/docker.sock:/var/run/docker.sock -e DELETE_DOCKER_IMAGE=true -e EXECUTION_MEMORY_MAX=10000 -e EXECUTION_MEMORY_MIN=0 -e EXECUTION_TIME_MAX=15 -e EXECUTION_TIME_MIN=0 -e ENABLE_KAFKA_MODE=true -e KAFKA_INPUT_TOPIC=topic.input -e KAFKA_OUTPUT_TOPIC=topic.output -e KAFKA_CONSUMER_GROUP_ID=compilerId -e KAFKA_HOSTS=ip_broker1,ip_broker2,ip_broker3 -e API_KEY=YOUR_API_KEY -e API_SECRET=YOUR_API_SECRET -t compiler

Queueing system with RabbitMq

The Remote Code Compiler integrates with RabbitMq for queueing. This allows for efficient handling of high-throughput data.

remote code compiler rabbitMq mode

To enable Rabbit MQ mode you should pass to the container the following env variables :

docker container run -p 8080:8082 -v /var/run/docker.sock:/var/run/docker.sock -e DELETE_DOCKER_IMAGE=true -e EXECUTION_MEMORY_MAX=10000 -e EXECUTION_MEMORY_MIN=0 -e EXECUTION_TIME_MAX=15 -e EXECUTION_TIME_MIN=0 -e ENABLE_RABBITMQ_MODE=true -e RABBIT_QUEUE_INPUT=queue.input -e RABBIT_QUEUE_OUTPUT=queue.output -e RABBIT_USERNAME=username -e RABBIT_PASSWORD=password -e RABBIT_HOSTS=ip_broker1,ip_broker2,ip_broker3 -t compiler

Monitoring

prometheus logo

Monitoring is crucial for maintaining the health and performance of the Remote Code Compiler. The application includes monitoring tools to track resource usage, execution metrics, and system logs. Use tools like Prometheus and Grafana for comprehensive monitoring and visualization.

Check out exposed prometheus metrics using the following url : http:///prometheus

Logging

By default, only console logging is enabled.

You can store logs in a file and access to it using /logfile endpoint by setting the environment variable ROLLING_FILE_LOGGING to true. All logs will be kept for 7 days with a maximum size of 1 GB.

logstash logo

You can also send logs to logstash pipeline by setting these environment variables LOGSTASH_LOGGING to true and LOGSTASH_SERVER_HOST, LOGSTASH_SERVER_PORT to logstash and port values respectively.

Getting Help

If you encounter any issues or need assistance with the Remote Code Compiler, feel free to reach out for support. You can:

Author