Open lib-j opened 4 years ago
What do you mean "breaking python conventions over class attributes"?
You should use a pattern like:
from astroquery.gaia import Gaia
Gaia.ROW_LIMIT = 1000
Gaia.query...
in the present implementation.
I think I follow the general premise here, and it would be nice to have Gaia behave like Vizier, e.g.:
from astroquery.vizier import Vizier
vv = Vizier(row_limit=100)
vv.query...
but that is presently not implemented.
Note that we will not consider changes to set defaults to "unlimited", as that is more likely to result in dangerous undesired requests (the worst case might be requesting terabytes of data unwittingly and the server not rejecting the request; the best case is the server rejecting the request)
I understand the worries about large queries overloading the servers. However at minimum it should be made clear to the user in the documentation that a default is set and how to override this easily. The current default number being limited to 50 is far too small for general users.
This made understanding the results from your documented examples harder to understand as they return more sources than the default size.
Regarding python conventions PEP8 states that constants are the only type of variable that should be fully capitalised, as shown by https://www.python.org/dev/peps/pep-0008/#constants. These should not be generally changed by users, and should be only set during builds. Hence my confusion when this constant attribute had to be changed to return queries with result sizes over 50. Like almost all Gaia queries.
OK, there's a very good point here that the documentation has not kept up with code changes. There should have been accompanying documentation changes in #1641 that were left out.
Re: constants, it's not clear to me that we violate PEP8; these are class attributes, not global constants. I've seen class-level constants used in this way in other parts of the python ecosystem, but if we are in violation of a specific part of the PEP, I think a refactor and eventual deprecation would be in order. @bsipocz @ceb8 , could you chime in on this?
@lib-j thanks for raising these points, they're important and it's likely we simply overlooked recommendations in the original design and clearly had some oversights in recent changes. We welcome documentation fix PRs, if you're willing. We'd also welcome help on refactoring to meet PEP requirements, if that's the path we decide on (but let's discuss a little more before anyone takes the time to make a sweeping change).
I wasn't intending to do any documentation PR's; I was only looking through the code to try an understand why my search results were not producing the expected values or the same as what the examples returned, hence raising this issue to prevent other users loosing time trying to figure this out.
On the class attributes they are commonly not capitalised as stated in the PEP8 link.
Cheers for looking into this for me and responding so quickly.
While the capital letter convention does not strictly follow pep8 it is common enough throughout the python ecosystem that I am comfortable leaving it as is. Strict best practice would be to add a getter, but that also seems minor.
The lack of documentation is important, and ideally the function would produce a warning if results have been truncated due to hitting the row limit.
To summarize the current state of this issue:
ROW_LIMIT
ROW_LIMIT
would be helpful
The GaiaClass.ROW_LIMIT put in from this PR: https://github.com/astropy/astroquery/pull/1641
Now limits all queries to return only the top 50 objects. There is no way to adjust this row limit without breaking python conventions over class attributes.
A better solution would be to use this attribute (set to unlimited) as a default to all the search methods, with a parameter to these methods for overriding the default.