Closed jreadey closed 6 months ago
Here is a list of cases where either HSDS or the client can attempt to use an overlong URL due to attributes or links having long names:
Client request is safe, but can cause HSDS to use an overlong URL:
- POST_Dataset, dset_sn.py:1182 -> PUT_Link
- POST_Datatype, ctype_sn.py:238 -> PUT_Link
- POST_Group, group_sn.py:238 -> PUT_Link
Both client request and resultant HSDS request can use an overlong URL:
- PUT_Link, link_sn.py:286 -> PUT_Link
- GET_Link, link_sn.py:158 -> GET_Link
- DELETE_Link, link_sn.py:363 -> DELETE_Link
- GET_Attribute, attr_sn.py:183 -> GET_Attribute
- GET_AttributeValue, attr_sn.py:514 -> GET_Attribute
- PUT_Attribute, attr_sn.py:390 -> PUT_Attribute
- DELETE_Attribute, attr_sn.py:451 -> DELETE_Attribute
- PUT_AttributeValue, attr_sn.py:642, attr_sn.py:750 -> GET_Attribute, PUT_Attribute
These API endpoints expect a potentially overlong h5path query string in the URL, and then make a potentially overlong GET_Link request in getObjectIdByPath:
- GET_Domain
- GET_Datatype
- GET_Group
- GET_Dataset
POST_Attributes and PUT_Attributes as implemented in this PR would provide a safe alternative to the GET_Attribute/GET_AttributeValue/PUT_Attribute/PUT_AttributeValue API endpoints by storing long attribute names in the request body.
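As a rough illustration of why moving names into the body helps, the sketch below builds a request whose URL length does not depend on the attribute names. The route layout, the `attr_names` body key, and the helper name `build_attributes_request` are assumptions made for illustration, not necessarily the format used in this PR.

```python
import json


def build_attributes_request(base, obj_id, attr_names):
    """Return (url, body) for a hypothetical POST_Attributes-style request.

    The attribute names travel in the JSON body, so the URL stays short
    no matter how long the names are.
    """
    # Assumed route layout; only the object id appears in the URL.
    url = f"{base}/groups/{obj_id}/attributes"
    # Assumed body key; the real key may differ.
    body = json.dumps({"attr_names": list(attr_names)})
    return url, body


long_name = "a" * 20000
url, body = build_attributes_request("http://hsds.example", "g-1234", [long_name])
print(len(url), len(body))
```

The URL here is a few dozen characters long even though the attribute name is 20000 characters; the size cost moves to the body, which is not subject to the request-line limit.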
The HDF5 Library doesn't formally specify an attribute name limit, and only attribute names up to 255 bytes are tested. For this reason, I think we can safely consider attribute names that cause a URL to exceed aiohttp's default limit (8190 bytes) to be unsupported. In that case, we wouldn't need to worry about changing DELETE_Attribute.
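To make the threshold concrete, here is a minimal sketch that checks whether a percent-encoded attribute name would push a URL past the 8190-byte limit mentioned above. The route layout and the helper names `attribute_url` and `is_overlong` are hypothetical, and the check treats the limit as applying to the URL alone, which is a simplification.

```python
from urllib.parse import quote

# Default limit cited in the discussion above.
AIOHTTP_DEFAULT_LIMIT = 8190


def attribute_url(base, obj_id, attr_name):
    """Build a URL with the attribute name in the path (assumed layout)."""
    # safe="" so that "/" inside a name is also percent-encoded.
    return f"{base}/groups/{obj_id}/attributes/{quote(attr_name, safe='')}"


def is_overlong(url, limit=AIOHTTP_DEFAULT_LIMIT):
    """Return True if the encoded URL exceeds the given byte limit."""
    return len(url.encode("utf-8")) > limit


base, obj_id = "http://hsds.example", "g-1234"
print(is_overlong(attribute_url(base, obj_id, "a" * 255)))
print(is_overlong(attribute_url(base, obj_id, "a" * 9000)))
```

Note that percent-encoding inflates non-ASCII names: each UTF-8 byte becomes three characters in the URL, so a name well under 8190 bytes can still produce an overlong URL.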
The changes that would need to be made to avoid all other potentially overlong URLs:
- PUT_Link should take the link name in the request body instead of the query string.
- DELETE_Link would need to be implemented under POST_Link or PUT_Link.
- GET_* would need to be implemented under POST_* or PUT_* for links, groups, datasets, datatypes, and domains.

The latter two changes don't seem ideal: moving all the GET functionality under POST/PUT could be unintuitive or duplicate behavior. Some other possible solutions to handle the long URL cases in GET_*/DELETE_* endpoints:
- aiohttp, and hope that intermediaries between HSDS and the client accept a URL up to ~67 Kb.

At this point, there are four different ways to "get the value of an attribute" (GET_Attribute, GET_Attributes, GET_AttributeValue, and POST_Attributes), and two different ways to get the values of multiple attributes. Is there a motive for each of these different methods to continue to exist beyond backwards compatibility?
Add ability to get multiple attributes via PUT attributes or write multiple attributes with POST attributes.