@sreeder actually I'm getting the same error for both of these. The only one I got working so far is a call like http://127.0.0.1:8080/odm2timeseries/rest/1_1/GetSites?site=odm2timeseries:Rio%20Icacos%20Trib-IO
Here is the commit I added to my fork: https://github.com/miguelcleon/WOFpy/commit/f53476b7edf5b335b808e5cbc2f45b7a9ef0beab
Just wondering about the status of this issue. @miguelcleon, are you saying the commit you added on your fork resolved the GROUP BY problems for you?
I'm somewhat confused b/c I remember seeing many other exchanges on this issue last week; I assume they've been deleted? Some were referring to changes Miguel thought he'd made, but which turned out to already be in WOFpy.
@sreeder, I assume you haven't had a chance to look further into this. And @lsetiawan doesn't have access to the EnviroDIY PostgreSQL database Stephanie is testing (AFAIK), so he wouldn't have been able to look into this.
Thanks, all.
@emiliom The commit I added in my fork only resolved the group by error for the `GetSites?site=` type call. It isn't terribly useful. Yeah, I deleted some of the comments because they didn't seem helpful.
> The commit I added in my fork only resolved the group by error for the `GetSites?site=` type call. It isn't terribly useful.
Well, that's still progress! Thanks. Are you saying that a `GetSites` request w/o the `site=` argument still returns a GROUP BY error?
Yes, the same error Stephanie posted here.
@emiliom @lsetiawan Would it be enough to give you access to the server that we are trying to host WOFpy on? If not, I can work on creating a copy of the live database on that server, which you guys could have access to. The issue with giving you access to the live database is that there are multiple firewalls we don't control; it is easier for us to make a copy and move it into a demilitarized zone.
@sreeder if that server has a connection to the database and we get the credentials to the database, then that would work. Thanks.
Wouldn't it be easier (for Don and me) if we could deploy the PostgreSQL database on our local machines for testing? We would not expose anything to the internet, of course. It would just be to minimize hassles and lags during testing and bug hunting.
@sreeder, that would just mean you sending us a PostgreSQL dump file that we could then restore into a working database.
@emiliom that is a great idea. I have created the PostgreSQL dump file and added it to the ODM2 Google Drive folder, under ODM2_db_scripts. Let me know if you have any trouble finding it.
Thanks, @sreeder! I was able to download it. @lsetiawan and I will let you know if we have any issues restoring it into PostgreSQL.
@sreeder What PostgreSQL and PostGIS versions are you using? Thanks.
Also, I am currently having `ROLE` issues. Please dump without any role/owner associations. Thanks.
```
pg_restore: [archiver (db)] Error from TOC entry 427; 1255 37207 FUNCTION reset_sequence(text, text, text) admin_1
pg_restore: [archiver (db)] could not execute query: ERROR: role "admin_1" does not exist
    Command was: ALTER FUNCTION public.reset_sequence(tablename text, columnname text, sequence_name text) OWNER TO admin_1;
pg_restore: [archiver (db)] Error from TOC entry 404; 1259 37184 TABLE django_migrations websdl_user
pg_restore: [archiver (db)] could not execute query: ERROR: role "websdl_user" does not exist
    Command was: ALTER TABLE django_migrations OWNER TO websdl_user;
```
@lsetiawan we are using PostgreSQL version 9.5.6. I don't believe he actually has PostGIS installed, and I have updated the script on Google Drive without the role information.
Thanks, @sreeder! FYI, @lsetiawan will work on this tomorrow; he usually works on a different project on Thursdays.
@sreeder There doesn't appear to be an `odm2` schema in your dump, or am I missing something? Thanks.
You're right, @lsetiawan, there's supposed to be an `odm2` schema. You don't necessarily show the entire content of the Schemas object in your screenshot, but after some checking with PgAdmin, I would expect schemas to be presented in alphabetical order, so it should be before `pg_catalog`.
(BTW, is this a screenshot from PgAdmin or DBeaver? I suspect it's the latter. But I assume you're used to seeing ODM2 PostgreSQL DB's in DBeaver, from the LCZO and CZIMEA DB's.)
On the other hand, I'd be surprised that @sreeder hasn't run into problems if she doesn't actually have an `odm2` schema ...
@emiliom @lsetiawan so sorry about that guys, I didn't realize he sent me two files, and of course I uploaded our Django one instead of the ODM2 one. I have replaced the incorrect file on Google Drive with the correct ODM2 schema file.
@sreeder No problem. Thanks for the replacement! All works now. I have an `odm2` database with an `odm2` schema and all the appropriate tables! 😸
Now off to testing!
Awesome! Thanks, @lsetiawan and @sreeder!
@lsetiawan I was running some unrelated SQL queries against a PostgreSQL database and I happened to include an order by clause where one was inappropriate. I got a similar GROUP BY error. Seems like this could happen in general when the SQL is malformed.
Seems like it would be a good idea to run the SQL commands directly against PostgreSQL and tweak them, then try and translate the changes back into SQLAlchemy.
Maybe this helps?
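For example, SQLAlchemy can render the query it generates as PostgreSQL SQL that you can paste into psql (just a sketch; `db_session` and `odm2_models` here stand in for whatever session and ODM2 model module the DAO is already using):

```python
from sqlalchemy.dialects import postgresql

# Build (or grab) the query the same way the DAO does; placeholder example here.
query = db_session.query(odm2_models.Sites, odm2_models.TimeSeriesResults). \
    join(odm2_models.FeatureActions). \
    filter(odm2_models.FeatureActions.SamplingFeatureID == odm2_models.Sites.SamplingFeatureID)

# Print the SQL as the PostgreSQL dialect would emit it; literal_binds inlines
# simple bound parameter values so the output can run as-is in psql.
print(query.statement.compile(dialect=postgresql.dialect(),
                              compile_kwargs={'literal_binds': True}))
```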
> @lsetiawan I was running some unrelated SQL queries against a PostgreSQL database and I happened to include an order by clause where one was inappropriate. I got a similar GROUP BY error. Seems like this could happen in general when the SQL is malformed.
Yeah, but the weird thing is that the SQLAlchemy code does work with MySQL and SQLite! You'd think GROUP BY errors would be basic SQL issues that would work and fail the same across RDBMS's; but you never know.
> Seems like it would be a good idea to run the SQL commands directly against PostgreSQL and tweak them, then try and translate the changes back into SQLAlchemy.
I think @lsetiawan will first test the offending SQLAlchemy code in isolation, outside wofpy. If he's still having problems, I'll work with him to translate that code into equivalent SQL (I'm a more experienced SQL coder than Don is :stuck_out_tongue_winking_eye: )
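For what it's worth, a minimal setup for that kind of isolated test could look like this (a sketch only: the connection string is a placeholder for the restored EnviroDIY dump, and the odm2api model import path may differ by version):

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Assumption: the same ODM2 SQLAlchemy models that WOFpy's DAO imports;
# adjust the module path to whatever odm2api version is installed.
import odm2api.ODM2.models as odm2_models

# Placeholder connection string for the restored PostgreSQL database.
engine = create_engine('postgresql://odm2_user:password@localhost:5432/odm2')
Session = sessionmaker(bind=engine)
db_session = Session()

# Quick sanity check before exercising the DAO's GetSites query.
print(db_session.query(odm2_models.Sites).count())
```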
> Yeah, but the weird thing is that the SQLAlchemy code does work with MySQL and SQLite! You'd think GROUP BY errors would be basic SQL issues that would work and fail the same across RDBMS's; but you never know.
Yes, I would think so. I guess PostgreSQL tends to be a bit less forgiving than other RDBMS's :persevere:
> I think @lsetiawan will first test the offending SQLAlchemy code in isolation, outside wofpy.
This is exactly what I'm doing now 😄
> I guess PostgreSQL tends to be a bit less forgiving than other RDBMS's
Good point.
> > I think @lsetiawan will first test the offending SQLAlchemy code in isolation, outside wofpy.
>
> This is exactly what I'm doing now
Great!
Looking at `get_all_sites` in the DAO, this is where the problem lies for the `GetSites` query.
I modified the query a bit to make it work and here's what I found:

**First Finding**: `Sites` and `TimeSeriesResults` are polymorphic tables; I only put `odm2_models.Sites` rather than `odm2_models.Sites.SamplingFeatureID` and `odm2_models.SamplingFeatures.SamplingFeatureID`; same case with `TimeSeriesResults`:
```python
sites1 = db_session.query(odm2_models.Sites, odm2_models.TimeSeriesResults). \
    join(odm2_models.FeatureActions). \
    filter(odm2_models.FeatureActions.SamplingFeatureID == odm2_models.Sites.SamplingFeatureID,
           odm2_models.TimeSeriesResults.FeatureActionID == odm2_models.FeatureActions.FeatureActionID). \
    group_by(odm2_models.Sites, odm2_models.TimeSeriesResults).all()
```
Result:
```python
>>> len(sites1)
249
```
Seems like it just queried everything; `group_by` didn't work since it's grouping by multiple tables (every column of both `Sites` and `TimeSeriesResults`), so duplicate rows never collapse.
**Second Finding**: putting `TimeSeriesResults` as part of the `JOIN`:
```python
sites2 = db_session.query(odm2_models.Sites). \
    join(odm2_models.FeatureActions). \
    join(odm2_models.TimeSeriesResults). \
    filter(odm2_models.FeatureActions.SamplingFeatureID == odm2_models.Sites.SamplingFeatureID,
           odm2_models.TimeSeriesResults.FeatureActionID == odm2_models.FeatureActions.FeatureActionID). \
    group_by(odm2_models.Sites).all()
```
Result:
```python
>>> len(sites2)
42
```
This looks more promising: the query actually grouped by `Sites`, but now `TimeSeriesResults` isn't part of the resulting query. But then this goes back to https://github.com/ODM2/WOFpy/issues/116
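One possible variation (just a sketch, not tested here) would keep some `TimeSeriesResults` information by aggregating it while grouping only on `Sites`, e.g. counting each site's time series results:

```python
from sqlalchemy import func

# Sketch: group on Sites only and aggregate over TimeSeriesResults.
# Assumes ResultID is available on the TimeSeriesResults model (inherited from Results).
sites3 = db_session.query(
        odm2_models.Sites,
        func.count(odm2_models.TimeSeriesResults.ResultID).label('num_ts_results')). \
    join(odm2_models.FeatureActions). \
    join(odm2_models.TimeSeriesResults). \
    filter(odm2_models.FeatureActions.SamplingFeatureID == odm2_models.Sites.SamplingFeatureID,
           odm2_models.TimeSeriesResults.FeatureActionID == odm2_models.FeatureActions.FeatureActionID). \
    group_by(odm2_models.Sites).all()
```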
Nice work. Let's look at it together on Monday. I'll be able to compose equivalent SQL queries then, to compare.
What happens when you run the second SQLAlchemy query type ("Second Finding") against the MySQL LBR DB? Does it work, and do you get the same number of sites as in the original query?
I'm guessing that if you run the first SQLAlchemy query type ("First Finding") against that DB, you'll get a much larger number of sites (due to duplication).
My guess is that your second query is the way to go. And that's the query type we had investigated a couple of weeks ago, though not with PostgreSQL. If it works with MySQL and SQLite, we'll just have to make some small tweaks to the code in `get_all_sites` that handles the response, I think.
@emiliom @lsetiawan Hey, I am not understanding why we are trying to get the TimeSeriesResults back when we are searching for all of the sites, and why we are doing a group by and not just trying to get all of the DISTINCT sites. Also, it would probably be easiest to query the SamplingFeature object and filter by a type of "Site".
@sreeder, we'll explain better on Monday. Right now we weren't trying to explain it to a wider audience :wink:
Very brief answers:
> why we are trying to get the TimeSeriesResults back when we are searching for all of the sites
Because that's how the code was set up before we started working with it, partly b/c it assumes that the ODM2 database could have different result types and this DAO is intended to only return sites that have timeseries results.
> and why we are doing a group by and not just trying to get all of the DISTINCT sites
We haven't explored DISTINCT in this context, but it's probably fine and may have better performance.
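For reference, a DISTINCT version of the "Second Finding" query might look like this (a sketch only, not tested against all three backends):

```python
# Sketch: use DISTINCT instead of GROUP BY to de-duplicate sites that have
# time series results.
sites_distinct = db_session.query(odm2_models.Sites). \
    join(odm2_models.FeatureActions). \
    join(odm2_models.TimeSeriesResults). \
    filter(odm2_models.FeatureActions.SamplingFeatureID == odm2_models.Sites.SamplingFeatureID,
           odm2_models.TimeSeriesResults.FeatureActionID == odm2_models.FeatureActions.FeatureActionID). \
    distinct().all()
```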
@emiliom @lsetiawan are you using the API read_service calls or generating your own? Have you tried
`read.getSamplingFeatures(type = "Site", results = True)`
That should get you all of the Sampling features of type "Site", where results exist. From the sample database I sent you, there are 42 SamplingFeatures returned.
@emiliom
> What happens when you run the second SQLAlchemy query type ("Second Finding") against the MySQL LBR DB? Does it work, and do you get the same number of sites as in the original query?

I get the same number of sites as in the original query.

> I'm guessing that if you run the first SQLAlchemy query type ("First Finding") against that DB, you'll get a much larger number of sites (due to duplication).
Indeed. This is correct.
> @emiliom @lsetiawan are you using the API read_service calls or generating your own?
Not at this time. See the comment I just added here, in response to another comment of yours.
> Have you tried `read.getSamplingFeatures(type = "Site", results = True)`? That should get you all of the Sampling features of type "Site", where results exist. From the sample database I sent you, there are 42 SamplingFeatures returned.
Besides not using odm2api querying at this time, that would not work generically b/c, as I mentioned earlier, this DAO (odm2 timeseries) is targeted at TimeSeries result types only. It would work in a database that has no other result types (such as EnviroDIY?), but we want it to work generically, including in @miguelcleon's Luquillo CZO PostgreSQL database, which stores different result types.
BTW, DAO's can be modified by users. So, you may find that you can optimize WOFpy (for performance and maintenance) with your ODM2 database in the future by editing the built-in odm2 timeseries DAO. We can talk about this once we have WOFpy working with your PostgreSQL DB, and especially after we've issued a new release and conda package (hopefully next week).
@emiliom In that case, should I use the "Second Finding" way of querying? Getting rid of querying TimeSeriesResults and just putting it as part of the join? Thanks.
> In that case, should I use the "Second Finding" way of querying? Getting rid of querying TimeSeriesResults and just putting it as part of the join?
Yup. But remember you'll probably need to edit the block at odm2_timeseries_dao.py#L51 too.
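Roughly the kind of change I mean, since the response handling would no longer unpack (site, result) tuples (a hypothetical illustration only; the names below are placeholders, not WOFpy's actual code):

```python
def build_site_list(query_results):
    """Hypothetical sketch of the response-handling change in the DAO."""
    sites = {}
    # Before: the query returned (Sites, TimeSeriesResults) tuples:
    #     for site, ts_result in query_results:
    #         sites[site.SamplingFeatureCode] = site
    # After: the query returns Sites entities only:
    for site in query_results:
        sites[site.SamplingFeatureCode] = site
    return list(sites.values())
```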
@emiliom Roger that! Thanks. 👍
I am getting a GROUP BY error when connecting to a non-LBR ODM2 PostgreSQL database.
@emiliom @miguelcleon @lsetiawan