Closed patrickleclair-GORDONFN closed 3 years ago
Thanks for the bug report! Taking a look now...
Which domain are you connecting to? Can you provide the client setup code so that I can try to reproduce the issue?
I googled the dataset identifier and it led me to this site (which currently appears to be down): https://data.calgary.ca/d/y8as-bmzj
The code I used was:
client = Socrata("data.calgary.ca", app_token="")
results = client.get_all("y8as-bmzj")
Alright, I'm able to query this domain now. Let me try to reproduce. Something else I did notice, though, is that above you put the response in `results` but then iterate over an undefined `data` variable. Could this be causing the bug?
I'm sorry, I wasn't able to reproduce the problem that you're having. Here is the code that I used.
from sodapy import Socrata
from csv import DictWriter

client = Socrata("data.calgary.ca", app_token="")
results = client.get_all("y8as-bmzj")

with open("coc_test.csv", "w", newline="", encoding="utf-8") as line_file:
    csv_writer = DictWriter(
        line_file,
        fieldnames=["id", "sample_site", "sample_date",
                    "parameter", "numeric_result", "result_qualifier",
                    "formatted_result", "result_units",
                    "latitude_degrees", "longitude_degrees", "site_key"],
    )
    for row in results:
        csv_writer.writerow(row)
After this finished running, it correctly produced a file with 293,039 unique rows.
$ uniq -u coc_test.csv | wc -l
293039
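One caveat with that check: `uniq -u` only collapses *adjacent* duplicate lines, so on an unsorted file it can miss repeats. A sort-independent count can be sketched in Python (`coc_test.csv` here is the file produced by the snippet above; the throwaway-file demonstration at the bottom is just for illustration):

```python
import os
import tempfile


def count_duplicates(path):
    """Return (total_lines, distinct_lines) regardless of line order."""
    seen = set()
    total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            seen.add(line)
            total += 1
    return total, len(seen)


# Tiny demonstration with a throwaway file; in practice you would
# point this at coc_test.csv instead.
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".csv") as tmp:
    tmp.write("a,1\nb,2\na,1\n")
    name = tmp.name

total, distinct = count_duplicates(name)
os.remove(name)
print(total, distinct)  # 3 2  (one duplicated line)
```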
I'm going to close this issue. Please feel free to re-open if I missed something.
After running the code above, I was receiving different final files each time. Sometimes these would include duplicate rows; other times I would have rows missing. I reran my code using pagination over the OData API instead with no issues. From my debugging, the only thing I could figure was that the `get_all()` function wasn't returning the data correctly.
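For reference, manual pagination boils down to a limit/offset loop. The sketch below uses a stand-in page-fetching function so it runs on its own; against the real dataset the fetch would be an API call such as sodapy's `client.get("y8as-bmzj", limit=limit, offset=offset)`:

```python
# Generic offset pagination: request pages until an empty or short page
# comes back. fetch_page is a stand-in for the real API call.
def paginate(fetch_page, limit=1000):
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        if not page:
            break
        yield from page
        if len(page) < limit:  # last, partial page
            break
        offset += limit


# Demonstration against an in-memory "dataset" of 2,500 rows:
DATA = [{"id": i} for i in range(2500)]


def fake_fetch(limit, offset):
    return DATA[offset:offset + limit]


rows = list(paginate(fake_fetch, limit=1000))
print(len(rows))  # 2500
```

Because each page is requested with an explicit offset, a dropped or retried request can simply be re-fetched, which avoids the duplicated/missing rows described above.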