Alekh-sinha / Generative-AI-QA-Model


Google default credentials were not found #1

Open HripsimeS opened 2 months ago

HripsimeS commented 2 months ago

@Alekh-sinha Hello,

I am trying to test your project. I changed the URL on line 67 of web_extraction.py, but when I run the script from the command line I get the following error:

google.auth.exceptions.DefaultCredentialsError: Your default credentials were not found.

Do I still need to make more modifications to the script, or do I need to set up Application Default Credentials? If possible, could you please also send the URL you used in web_extraction.py?

Alekh-sinha commented 2 months ago

@HripsimeS Lines 83-85 of the code transfer the extracted web data to a GCP Cloud Storage bucket, which requires your own credentials. If you don't want to use a bucket, you can simply comment out these lines and run the script again; the extracted file will then be saved to your local drive. These are the lines I mean:

    client = storage.Client()
    bucket = client.get_bucket(CFG.bucket)
    bucket.blob(CFG.path+'/'+filename).upload_from_string(df.to_csv(), 'text/csv')
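As an alternative to commenting the lines out by hand, the upload can be made optional. The following is a minimal sketch, not the repository's code: the function name save_csv and its parameters are hypothetical, and it assumes the google-cloud-storage package is installed whenever a bucket name is passed. If Application Default Credentials are wanted instead, they can be set up with gcloud auth application-default login or by pointing the GOOGLE_APPLICATION_CREDENTIALS environment variable at a service-account key file.

```python
import os


def save_csv(csv_text, filename, local_dir, bucket_name=None, bucket_path=""):
    """Always write the CSV locally; upload to a bucket only when asked.

    Returns (local_path, uploaded). All names here are illustrative.
    """
    os.makedirs(local_dir, exist_ok=True)
    local_path = os.path.join(local_dir, filename)
    with open(local_path, "w", encoding="utf-8") as fh:
        fh.write(csv_text)

    if bucket_name is None:
        # No bucket requested, so no credentials are needed at all.
        return local_path, False

    try:
        # Imported lazily so the local-only path works without the package.
        from google.cloud import storage

        client = storage.Client()  # raises DefaultCredentialsError without credentials
        bucket = client.get_bucket(bucket_name)
        blob = bucket.blob(bucket_path + "/" + filename)
        blob.upload_from_string(csv_text, "text/csv")
        return local_path, True
    except Exception as exc:  # broad on purpose: missing package or missing credentials
        print(f"Upload skipped ({exc.__class__.__name__}); file kept at {local_path}")
        return local_path, False
```

Called without a bucket name, for example save_csv(df.to_csv(), filename, "working_dir"), the function never touches Google Cloud, so the credentials error cannot occur.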

HripsimeS commented 2 months ago

@Alekh-sinha Thank you, that fixed the credentials error. In web_extraction.py I modified the URL as follows so that the script reads the full URL path:

    base_url = "https://www.ibm.com"
    relative_url = "/topics/large-language-models/"
    URL = base_url + relative_url
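Plain string concatenation works here because the base has no trailing slash and the relative part starts with one. A slightly more forgiving sketch uses urljoin from the standard library, which resolves the relative part against the base whether or not the base ends in a slash. The variable names are taken from the snippet above; nothing else about web_extraction.py is assumed.

```python
from urllib.parse import urljoin

base_url = "https://www.ibm.com"
relative_url = "/topics/large-language-models/"

# urljoin resolves relative_url against base_url.
URL = urljoin(base_url, relative_url)
print(URL)  # https://www.ibm.com/topics/large-language-models/
```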

After running web_extraction.py, I got a CSV file containing the following extracted information:

[image: contents of the extracted CSV file]

Then, when I ran rag.py, I got IndexError: list index out of range, as shown below:

[image: IndexError traceback from rag.py]

Do you have any idea what is going wrong and why I get this error with the URL I used? Thanks in advance!

Alekh-sinha commented 2 months ago

@HripsimeS I have changed the code, and it should work for you now. The error occurred because the code could not find the CSV file generated by web_extraction.py. I have now defined a working directory where everything is saved.
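An IndexError of this kind typically appears when code takes the first element of an empty file listing. The sketch below is not the repository's actual fix; the helper find_latest_csv and the idea of picking the newest file are assumptions made for illustration. It shows one way to look up the CSV in a working directory and fail with a readable message when nothing is there.

```python
import glob
import os


def find_latest_csv(working_dir):
    """Return the most recently modified CSV in working_dir.

    Raises FileNotFoundError with an explicit message instead of letting
    an empty list cause 'IndexError: list index out of range'.
    """
    matches = glob.glob(os.path.join(working_dir, "*.csv"))
    if not matches:
        raise FileNotFoundError(
            f"No CSV file found in {working_dir!r}; run web_extraction.py first."
        )
    # Pick the newest file in case several extractions were saved.
    return max(matches, key=os.path.getmtime)
```

With a check like this, a missing extraction step surfaces as a FileNotFoundError that names the directory rather than as an index error further down.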

HripsimeS commented 2 months ago

@Alekh-sinha I tested the web_extraction.py script again with my URL: https://www.ibm.com/topics/large-language-models

A working_dir folder is created, containing a CSV file with the following information:

[image: contents of the CSV file in working_dir]

Alekh-sinha commented 2 months ago

@HripsimeS Is there any error associated with it? I mean, it will extract the text from the HTML and store it in a single index.

HripsimeS commented 2 months ago

@Alekh-sinha No, I don't get any errors. After running web_extraction.py, a working_dir folder is created containing a CSV file with the information I shared above.