Closed sgt1796 closed 2 months ago
I'm pretty sure you can pass an absolute path to load_dotenv(), which should solve the issue. I'll take a look.
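For reference, a minimal sketch of passing an explicit path, assuming the python-dotenv package is installed. The helper names `resolve_env_path` and `load_env_file` are hypothetical and not taken from the repository.

```python
from pathlib import Path


def resolve_env_path(env_path: str) -> Path:
    """Turn a user-supplied path into an absolute one (hypothetical helper)."""
    return Path(env_path).expanduser().resolve()


def load_env_file(env_path: str) -> bool:
    """Load variables from a .env file that may live outside the script folder.

    Assumes python-dotenv is installed; it is imported lazily so the
    path helper above stays usable without it.
    """
    from dotenv import load_dotenv

    path = resolve_env_path(env_path)
    if not path.is_file():
        raise FileNotFoundError(f"No .env file at {path}")
    # dotenv_path points load_dotenv at this exact file instead of
    # letting it search for a file named .env on its own.
    return load_dotenv(dotenv_path=path)
```

Calling `load_env_file()` with the location where the file is mounted in the container would then populate `os.environ` before any client is created.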
Fixed. load_env() has moved to main(), and client creation has moved to init().
GPT_embedding now initializes a new client in each worker when the multiprocessing pool is initialized.
    # the API key is loaded by load_env(args.env) in main()
    def init(count, chunks, embedding_model):
        global counter, nchunks, EMBEDDING_MODEL, client
        counter, nchunks = count, chunks
        EMBEDDING_MODEL = embedding_model
        # initialize a new OpenAI client for the workers
        client = OpenAI()
A --env option has been added to the CLI.
More details on the change: c4e32e029ff0595394b6cad37a352d8cbac68d0d
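The overall shape of the fix can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `StubClient` is a hypothetical stand-in for `OpenAI()` (which reads `OPENAI_API_KEY` from the environment when constructed), `embed` and `parse_args` are made-up names, and the .env loading assumes python-dotenv.

```python
import argparse
import os
from multiprocessing import Pool

client = None  # one instance per worker process, created by init()


class StubClient:
    """Hypothetical stand-in for openai.OpenAI; reads the key from the environment."""

    def __init__(self):
        self.api_key = os.environ.get("OPENAI_API_KEY", "")


def init():
    """Pool initializer: runs once inside each worker process."""
    global client
    # Each worker builds its own client, so nothing has to be pickled
    # and sent over from the parent process.
    client = StubClient()


def embed(text):
    """Placeholder for the real embedding call."""
    return text, bool(client.api_key)


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", default=".env", help="path to the .env file")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Assumes python-dotenv; imported here so the rest of the sketch
    # runs without it.
    from dotenv import load_dotenv

    # Load the key before the workers start so they inherit it.
    load_dotenv(args.env)
    with Pool(processes=2, initializer=init) as pool:
        print(pool.map(embed, ["first chunk", "second chunk"]))
```

`main()` should be called from under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter.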
This issue should be resolved before #8
GPT_embedding.py cannot handle a .env file that's not in the same folder
The .env file cannot be stored in the Docker image; it has to be mounted via the -v option of docker run. This means the .env file will not be in the same folder as the scripts.
By default, load_env() looks for .env in the current folder and loads it at the beginning of the code, and then the client is created.
Adding a user-specified --env option means this declaration has to move from the beginning of the file into main(), which makes the client a local variable of main() instead of a module-level global.
Attempt 1
Declare client in main(), and use init() to initialize client globally for each process.
This causes client to be duplicated and results in an error.
Attempt 2
Pass client as a parameter to the functions that need it.
This does not work and raises an error.
The error occurs because the OpenAI client object contains a _thread.RLock object, which cannot be pickled. The multiprocessing library uses pickling to pass objects between processes, so it fails when it encounters the RLock.
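The pickling failure can be reproduced without the OpenAI library at all. In this small sketch, `ClientWithLock` is a hypothetical stand-in for any object that keeps an RLock as an attribute; it is not the real client class.

```python
import pickle
import threading


class ClientWithLock:
    """Hypothetical stand-in for a client that holds an RLock internally."""

    def __init__(self):
        self._lock = threading.RLock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Pickling fails as soon as the RLock attribute is reached.
error = try_pickle(ClientWithLock())
print(error)
```

This is the same step multiprocessing performs when an object is passed as an argument to a worker, which is why creating the client inside each worker avoids the problem.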