Open wayphinder opened 1 year ago
Some previous context: https://github.com/pypi/warehouse/issues/5653
Other context: I'm talking about this with @wayphinder in person. It sounds like the main place where this causes problems for him is in the changelog_since_serial endpoint, where e.action gets munged.
My first thought here was to add another member to the end of the list that gets returned here, essentially trading a bit of extra response size for probably not breaking compatibility (since the list will only strictly increase in size, and pre-existing fields won't change). But that might also cause issues that I'm not aware of.
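To make the append-only idea concrete, here is a minimal sketch of why a consumer that unpacks only the fields it knows about would keep working if one more member were added to each entry. The five-field order (name, version, timestamp, action, serial), the `parse_entry` helper, and the extra encoded member are assumptions for illustration, not a description of how any particular client is written.

```python
import base64

# Assumed entry shape for illustration: (name, version, timestamp, action, serial).
KNOWN_FIELDS = 5


def parse_entry(entry):
    """Hypothetical consumer: read only the first five members, ignore the rest."""
    name, version, timestamp, action, serial = entry[:KNOWN_FIELDS]
    return {
        "name": name,
        "version": version,
        "timestamp": timestamp,
        "action": action,
        "serial": serial,
    }


# An entry as it might look today (values are made up).
old_entry = ("pkg", "1.0", 1500000000, "add source file pkg-1.0.tar.gz", 42)

# The same entry with a hypothetical sixth member carrying the action losslessly.
raw_action = "add source file pkg\x0b-1.0.tar.gz"
extra = base64.b64encode(raw_action.encode("utf-8")).decode("ascii")
new_entry = old_entry + (extra,)

# A slicing consumer sees identical data for both shapes.
same = parse_entry(old_entry) == parse_entry(new_entry)
```

A client that unpacks with a fixed-length tuple assignment (for example `a, b, c, d, e = entry`) would instead raise on the longer entry, which is the kind of breakage the proposal risks for consumers other than the one checked in this thread.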
The primary known/supported use case for this endpoint is PEP-381 and its most prominent implementation, bandersnatch.
bandersnatch currently consumes changelog_since_serial in a way that would not choke on the proposed fix (adding another member to the end of the list): https://github.com/pypa/bandersnatch/blob/b3517c5acf696008da0ecd9544a4823a676191d1/src/bandersnatch/master.py#L207-L216
But in general I'm very hesitant to wake the XMLRPC dragon, as we currently support it only for mirroring and do not intend to take on new support for other uses.
While changing the XML-RPC API would be great, for my use case a one-time dump of the current data in a lossless format would also work. A lot of the same data should be available in the BigQuery data set, but my understanding is that some historic data is missing, which is why I would like the changelog data.
What's the problem this feature will solve?
The _clean_for_xml function removes some illegal characters. https://github.com/pypi/warehouse/blob/496338e94d6d62811671e7754507d3d8bc3942c0/warehouse/legacy/api/xmlrpc/views.py#L83-L93
This makes it harder to correlate this information with other sources. E.g. the action field contains filenames that might not match the actual filename because some characters are removed.
Describe the solution you'd like
Base64 or otherwise encode relevant fields in a way that does not remove data.
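As a rough illustration of the difference between stripping characters and encoding them, the sketch below compares a lossy cleaner with a Base64 round trip. The regular expression and the helper names (`lossy_clean`, `encode_field`, `decode_field`) are assumptions made for this example; they are not the actual Warehouse implementation linked above.

```python
import base64
import re

# Control characters that XML 1.0 does not allow (everything below 0x20
# except tab, LF and CR). Illustrative pattern only, not Warehouse's own.
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def lossy_clean(value: str) -> str:
    """Drop illegal characters; the original string cannot be recovered."""
    return _ILLEGAL_XML_CHARS.sub("", value)


def encode_field(value: str) -> str:
    """Encode the field as ASCII-only Base64 so no character is dropped."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_field(encoded: str) -> str:
    """Reverse encode_field on the client side."""
    return base64.b64decode(encoded).decode("utf-8")


# A made-up action string whose filename contains a vertical tab.
action = "add source file pkg\x0b-1.0.tar.gz"

cleaned = lossy_clean(action)                   # filename no longer matches
restored = decode_field(encode_field(action))   # identical to the input
```

With the cleaner, `cleaned` differs from the stored filename, so it cannot be matched against other sources; with the encoded form, `restored` equals the original string exactly.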
Additional context