Open christad92 opened 7 months ago
This probably needs to wait until https://github.com/OpenFn/gen/issues/42 is ready before we can take contributions.
Alternatively, work could start in a standalone repo and then be ported into the server.
Do not ask process related questions about how to apply and who to contact in the above ticket. The only questions allowed are about technical aspects of the project itself. If you want help with the process, you can refer instructions listed on Unstop and any further queries can be taken up on our Discord channel titled DMP queries. Here's a Video Tutorial on how to submit a proposal for a project.
Hey mentors, please assign this job to me. I am also a bit confused regarding the use of AI in this project.
Hello @christad92, can I work on this issue? I have strong experience with JavaScript and I have worked on large language models. Please let me know.
@josephjclark I am highly interested in this project. These are some of my contributions to a GSoC organisation: https://github.com/sugarlabs/musicblocks/commits/master/?author=falgun143
Can you please tell me how I should proceed? I cloned the repo and ran the command pnpm run setup, and I see the below error. Should I first create an image using Docker and then run the command?
Hi @falgun143, thanks for your interest in this project. To be considered as the contributor, please apply through the Unstop platform. The mentors will then shortlist the best proposals and select a contributor.
I hope this helps?
@christad92 Is there a channel for this on the Code4GovTech Discord? I am not able to find one. If there is a channel, please let me know.
@falgun143 I can confirm that and get back to you.
@christad92 Can you help me with this? Also, is it necessary to complete all the GitHub Classroom assignments, or is it better to understand more about the project and submit the proposal? And are there any updates on the Discord channel?
Greetings @christad92, I want to express my sincere interest in the development of this project. I can assure you of my full dedication, drawing on my graphic design and UI/UX design skills, my front-end development skills in ReactJS and JavaScript, and a passion for creating intuitive user experiences. My technical expertise, combined with a keen eye for design and functionality, positions me well to contribute effectively to the development of this project.
These are the approaches I have identified:
Scripting Approach: This approach is paramount due to its ability to provide precise control over the generation process. By writing custom scripts, you can tailor the file generation logic to match the specific requirements of the task. This approach is highly adaptable and scalable, making it well-suited for handling diverse input-output scenarios efficiently.
Graph-Based Approach: Represent the input-output relationships as a graph, where nodes represent the sample inputs and desired output, and edges denote the transformations between them. Use graph algorithms to traverse the graph and generate the job expression.js file based on the discovered paths and transformations.
Evolutionary Algorithm Approach: Employ evolutionary algorithms, such as genetic algorithms or genetic programming, to evolve candidate solutions for the job expression.js file. Represent potential solutions as individuals in a population, and iteratively apply genetic operators (e.g., mutation, crossover) to produce offspring with improved fitness. Evaluate the fitness of each individual based on its ability to match the provided sample inputs and desired output, ultimately generating a high-quality job expression.js file.
Here is my Resume : https://drive.google.com/file/d/1e4cOxVAfIjehLf7LemzX4oxPFhWd4y4D/view?usp=drive_link
Hello, My name is Debajyoti Ghosh. I am a Jr. Frontend Developer (Fresher). I have studied the project description and I am sharing my opinion to achieve it. To implement this project concept, we can utilize a combination of natural language processing (NLP) techniques along with code generation capabilities. Here's a high-level solution outline: Solution Outline:
By following this approach, we can create a robust solution that automates the generation of job expressions based on sample inputs and text instructions, thereby enhancing the efficiency of workflow development in OpenFn.
Thank You. DEBAJYOTI GHOSH
Hello @christad92, the links for "how to write jobs?" and "writing jobs" under the Documentation sub-heading are not working.
I want to know the convention for writing jobs. Can you please help me?
@REC-1104 the link has been fixed. Thank you
Hello @christad92, I have mailed you my proposal. Please have a look at it and let me know of any changes so that I can finally submit it on the Unstop website. @josephjclark I couldn't find your email anywhere. Can you please send me your email so that I can send my proposal to you for review?
Greetings @christad92, I've sent you my proposal via email. Could you please take a moment to review it and provide any feedback or suggestions for improvement. Once finalized, I'll be ready to submit it on the website.
Hey @josephjclark @christad92, I hope you're both doing well. I wanted to inform you that I have been selected for this DMP project. Over the past few days, I've been contemplating various approaches to solve this, and one of the most feasible options seems to be leveraging the existing Apollo (formerly Gen) repository. I noticed that you have set up the initial framework for making calls to Apollo services through the CLI.
To move forward, I propose creating a service for job generation using the following inputs. Users would provide these inputs via a .json file:
{
  "api_key": "apiKey",
  "adaptor": "@openfn/language-dhis2@4.0.3",
  "data": {
    "name": "bukayo saka",
    "gender": "male"
  },
  "signature": "Create a new trackedEntityInstance 'person' in dhis2 for the 'dWOAzMcK2Wt' orgUnit."
}
The CLI command openfn apollo job_expression_generator tmp/data.json -o tmp/output.json would then be used to call the job generation service on the Apollo server and return the desired result.
For job generation on the server, we can create a job_expression_generator service. This service would parse inputs from the .json file and generate the required output. Below is a sample implementation:
from util import DictObj, createLogger
from .utils import generate_job_prompt
from inference import inference

logger = createLogger("job_expression_generator")


class Payload(DictObj):
    api_key: str
    adaptor: str
    signature: str
    data: dict


# Generate a job expression based on the input data, adaptor specification, and instructions
def main(dataDict) -> str:
    data = Payload(dataDict)
    logger.info("Running job expression generator with adaptor {}".format(data.adaptor))
    # Read the fields declared on Payload (adaptor, signature, data)
    result = generate(data.adaptor, data.signature, data.data, data.get("api_key"))
    logger.success("Job expression generation complete!")
    return result


# Build the prompt and send it to the model
def generate(adaptor_spec, instructions, sample_input, key) -> str:
    prompt = generate_job_prompt(adaptor_spec, instructions, sample_input)
    result = inference.generate("gpt3_turbo", prompt, {"key": key})
    return result
The prompt for this might look like:
prompts = {
    "job_expression": (
        "You are a helpful Javascript code assistant.",
        "Below is a description of a task along with the adaptor specification and sample input data. "
        "Generate a JavaScript job expression that performs the task described. Ensure the job expression "
        "follows the conventions defined in the adaptor documentation.\n\n"
        "Adaptor: {adaptor}\n"
        "Instructions: {signature}\n"
        "Sample Input: {sample_input}\n"
        "====",
    ),
}
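As a rough sketch of how a helper like generate_job_prompt could fill in those placeholders: the template constant and the function body below are assumptions for illustration, not the actual code in the Apollo repo. The sample input is serialised with json.dumps so that it appears in the prompt as valid JSON.

```python
import json

# Hypothetical template reusing the placeholder names from the prompt above.
USER_TEMPLATE = (
    "Adaptor: {adaptor}\n"
    "Instructions: {signature}\n"
    "Sample Input: {sample_input}\n"
    "===="
)


def generate_job_prompt(adaptor, signature, sample_input):
    """Fill the template; missing values fall back to empty placeholders."""
    return USER_TEMPLATE.format(
        adaptor=adaptor or "",
        signature=signature or "",
        sample_input=json.dumps(sample_input if sample_input is not None else {}),
    )


prompt = generate_job_prompt(
    "@openfn/language-dhis2@4.0.3",
    "Create a new trackedEntityInstance",
    {"name": "bukayo saka"},
)
print(prompt)
```

Braces inside the serialised sample input are safe here because str.format only interprets braces in the template, not in the substituted values.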
For testing, we can run this with sample inputs from the CLI, write tests in the Apollo repo itself, or both.
I believe this approach aligns with what you're looking for. Could you please provide feedback on whether I am on the right track or suggest any improvements? Your guidance would be greatly appreciated.
PS: Apologies for bringing this up here, but I'm encountering some issues with setting up the Apollo repo. I've used Docker to set up the repo locally for now, but the conventional path throws errors related to $PATH not being found (for poetry). Apart from this, I would love to contribute towards the development of the apollo services.
Best regards
Hi @SatyamMattoo
Sorry for the late reply - and congrats! I'm delighted you've been chosen.
Unfortunately I'm tied up with various things this week and I can't get back to you right away. As you've seen a few things have changed since we put the issue up!
We're going to set up a kick off call late next week to go through this. I need to do a bit of planning beforehand. We'll be in touch soon to get that arranged, then we can let you loose!
echo, to test the functionality of the CLI.
@SatyamMattoo Did you get your local environment set up?
I don't think you should be using the docker build locally, that's just going to make life hard for yourself. Just install poetry etc on your machine, per the instructions, and you'll have a much better dev experience.
You'll be the first person, so far as I know, to set up and run apollo locally. So any feedback on the documentation and getting started stuff would be much appreciated. Please raise issues (or even PRs) over there for anything you struggle with!
Hey @josephjclark,
I attempted to set it up locally following the provided steps, but I encountered an issue where the virtual environment could not locate the $PATH to Poetry. After explicitly setting the path, it was unable to find the Python command. Both Poetry and Python are installed on my system and added to the virtual environment.
Despite following all the steps mentioned, there might be something I am missing. After spending two days trying to resolve these errors, I decided to use Docker bind mounts. It is working fine with Docker. While debugging, I noticed that there might be something missing in the documentation regarding the ENV PATH we add during the Dockerization of the repository.
Here is a screenshot of the error I received after running openfn apollo echo ./tmp/input.json -o output.json --local:
@SatyamMattoo What do you mean by "added to the virtual environment"? What operating system are you using?
What does poetry --version return from inside the apollo repo?
Can you run bun py echo tmp/test.json? You may need a simple json file at tmp/test.json, but a file not found error would suggest that your poetry installation is working.
Hey @josephjclark,
I am using Ubuntu. I meant the .venv/bin folder contains the python dependency, yet it is unable to find the python command.
Running poetry --version returns the current version of poetry as follows:
Running bun py echo tmp/test.json returns:
Hmm. What version of Ubuntu? I updated to 24.04 on Thursday evening and I seem to have a similar error now. All was working on Thursday afternoon...
I've just force reinstalled poetry and after re-running poetry install my setup is working (after reporting a broken install).
Looking at your error again it's coming out of bash trying to execute poetry run. It's like there's something funny with your bash environment. Nothing to do with the venv.
Can you run:
poetry run python services/entry.py echo tmp/test.json
?
You might get a list out of range exception but that would mean your environment is working (and I'm about to merge a fix to that into main)
I am using Ubuntu 22.04.4 LTS. The command is working as expected.
Okay I will try installing everything and setup the repo again and see if it fixes the issue. Do you think this might be due to a different version of Ubuntu? If so, I will update it.
No, I don't think it's related to the Ubuntu version. The upgrade broke my setup and I wondered if you'd also updated.
The problem seems to be in the bun environment. When running a bun script, bun is invoking bash and bash can't find poetry. But if you run those same commands directly, they work.
So it's something in the bun setup. Or perhaps your shell I suppose. What shell are you using? Anything strange in your setup?
To prove it, you can add a bun script to package.json which just calls poetry --version, and I expect it to fail.
Thank you @josephjclark! You were absolutely right; the error was in the bun setup. The Bun documentation does not mention adding these env variables to the .bashrc:
export BUN_INSTALL="$HOME/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
After adding them the commands run absolutely as expected. Thank you again for your assistance.
How strange!
I'll add a note in the readme about this in case it helps someone out.
Overview
When building workflows, users spend most of their time writing simple to advanced jobs on OpenFn. We'd like to harness AI to write job expressions based on English text requirements.
This feature is able to generate a job expression given some sample input data, the adaptor specification (name and version), and a text description of the desired output/instruction. It is expected that the generated job expression can be executed in the CLI.
Deliverables
We are looking for a new Python module to be implemented on apollo. This can be called from the existing openfn apollo command, but has no direct CLI implementation.
Inputs
We expect the following JSON payload to be submitted to the endpoint:
All of these inputs should be considered optional - if any are excluded, the generation should continue as best it can.
We may also require metadata to be included with this request - such as which model to use, perhaps parameters to drive the model, and an api key.
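A minimal sketch of treating every input as optional is shown below. The field names (adaptor, signature, data, api_key, model) are assumptions borrowed from the proposal earlier in this thread, not a confirmed payload schema, and JobRequest, parse_request and missing_inputs are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional


@dataclass
class JobRequest:
    # Every field defaults to None so that a partial payload is still accepted.
    adaptor: Optional[str] = None    # e.g. "@openfn/language-dhis2@1.2.3"
    signature: Optional[str] = None  # text description of the desired job
    data: Optional[Any] = None       # sample input used as the initial state
    api_key: Optional[str] = None    # assumed metadata field
    model: Optional[str] = None      # assumed metadata field


def parse_request(payload: dict) -> JobRequest:
    """Keep only the known keys; unknown keys are ignored rather than rejected."""
    known = {f.name for f in fields(JobRequest)}
    return JobRequest(**{k: v for k, v in payload.items() if k in known})


def missing_inputs(request: JobRequest) -> List[str]:
    """List the generation inputs that were left out, e.g. for a log message."""
    return [n for n in ("adaptor", "signature", "data") if getattr(request, n) is None]


request = parse_request({"adaptor": "@openfn/language-common@1.2.3", "extra": 1})
print(missing_inputs(request))
```

With this shape, the service can log which inputs are absent and still build a prompt from whatever was supplied.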
Output
The service should return an expression nested in a JSON object
Note that the expression string should be pure code, suitable for inserting into a code editor. No natural language, no markdown annotations of any kind.
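One way to approach the "pure code" requirement is to post-process the model response before returning it. The sketch below assumes the response is either plain code or a single fenced block, and it uses "expression" as a hypothetical key name for the JSON object. It does not attempt to remove natural language that surrounds a fence.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def to_pure_code(model_output: str) -> str:
    """Drop a wrapping markdown fence, if present, and trim whitespace."""
    lines = model_output.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]
    return "\n".join(lines).strip()


def build_response(model_output: str) -> str:
    """Nest the cleaned expression in a JSON object (key name is an assumption)."""
    return json.dumps({"expression": to_pure_code(model_output)})


raw = FENCE + "js\nfn(state => state);\n" + FENCE
print(build_response(raw))
```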
Implementation Notes
Highly valuable to use, and likely critical to this work, are the following two issues:
Once a basic job expression generator has been created, it may be wise to implement and integrate these issues.
Sample Inputs
One or more sample inputs (valid JSON) which can serve as the initial state for the job
The adaptor specification will be in the form of “@openfn/language-dhis2@1.2.3” or “@openfn/language-common@1.2.3”
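As an illustration of the specification format, the following sketch splits a specifier into package name and version. parse_adaptor is a hypothetical helper, and treating a missing version as None is an assumption rather than documented behaviour.

```python
from typing import Optional, Tuple


def parse_adaptor(spec: str) -> Tuple[str, Optional[str]]:
    """Split "@scope/name@version" into (name, version).

    The split happens at the last "@", so the leading "@" of the scope
    stays part of the package name.
    """
    name, separator, version = spec.rpartition("@")
    if not separator or not name:
        # Either no "@" at all, or only the scope "@": there is no version.
        return spec, None
    return name, version


print(parse_adaptor("@openfn/language-dhis2@1.2.3"))
print(parse_adaptor("@openfn/language-common"))
```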
Sample Output
Given the inputs above, we'd expect the output code to be:
Background
OpenFn is an open source platform for data integration and workflow automation accessible to users through a CLI or a web UI.
To use OpenFn, users build workflows, which are made up of one or more steps. At the time of writing these are all JavaScript-based "jobs" (the JS code itself is called a "job expression"). These jobs make use of adaptors to perform their tasks, e.g. make a request to an API endpoint, update a record in a database, aggregate data, send data to an external platform.
Here is an example of a job that uses the common adaptor to transform input data (in state) into a new object, transformedPatient:
And here is another job expression that uses the dhis2 adaptor to create a new patient record referred to as trackedEntityInstance.
Learn more about adaptors here and how they are used in OpenFn workflows.
Documentation: