nf-osi / nfportalutils

Utilities for NF Portal project and data management
https://nf-osi.github.io/nfportalutils/
MIT License

Increase safety of grant_specific_file_access #157

Closed allaway closed 7 months ago

allaway commented 9 months ago

I ran into an issue today when I called this:

nfportalutils::grant_specific_file_access(principal_id = 'XXXXXXX',
                                          entity_ids = foo$id,
                                          create_dataset = TRUE,
                                          project_id = "syn4939902",
                                          dataset_name = "Embargoed Data Request")

Where foo$id is:

c("syn15263929", "syn15263990", "syn15264247", "syn15264304", "syn15264357", "syn15264402", 
             "syn15267591", "syn15267645", "syn15267941", "syn15267989", "syn15268020", "syn15268049", 
             "syn15590029", "syn15590030", "syn15590035", "syn15590036", "syn15590037", "syn15590038", 
             "syn15590093", "syn15590094", "syn15590099", "syn15590101", "syn15590102", "syn15590103", 
             "syn15590194", "syn15590195", "syn15590211", "syn15590214", "syn15590218", "syn15590222", 
             "syn15590302", "syn15590303", "syn15590309", "syn15590310", "syn15590311", "syn15590312", 
             "syn15590365", "syn15590366", "syn15590371", "syn15590372", "syn15590373", "syn15590374", 
             "syn15590429", "syn15590430", "syn15590435", "syn15590436", "syn15590437", "syn15590438", 
             "syn23564011", "syn23564018", "syn23564419", "syn23564421", "syn23564447", "syn23564448", 
             "syn23564540", "syn23564544", "syn23564548", "syn23564551", "syn23564562", "syn23564563", 
             "syn23564573", "syn23564575", "syn23564583", "syn23564584", "syn23564585", "syn23564589", 
             "syn23564599", "syn23564600", "syn23564751", "syn23564754", "syn23564777", "syn23569731", 
             "syn23569734", "syn23569745", "syn23569776", "syn23569779", "syn23569866", "syn23569867", 
             "syn23569876", "syn23569877", "syn23569881", "syn23569882", "syn26449977", "syn26449980", 
             "syn26450036", "syn26450040", "syn26450102", "syn26450105", "syn26450125", "syn26450132", 
             "syn26450141", "syn26450145", "syn26450163", "syn26450166", "syn26450194", "syn26450203", 
             "syn26450293", "syn26450296", "syn26450299", "syn26450302", "syn26450305", "syn26450324", 
             "syn26450353", "syn26450356", "syn26450359", "syn26450363", "syn26470324", "syn26470325", 
             "syn26470326", "syn26470327", "syn26470328", "syn26470329", "syn26470330", "syn26470331", 
             "syn26470332", "syn26470333", "syn26470343", "syn26470344", "syn26470346", "syn26470347", 
             "syn26470348", "syn26470349", "syn26470358", "syn26470359", "syn26470360", "syn26470361", 
             "syn26470362", "syn26470363", "syn26470364", "syn26470365", "syn42494657", "syn42494780", 
             "syn42494918", "syn42495108", "syn42495247", "syn42495388", "syn42495643", "syn42496010", 
             "syn42496632", "syn42496805", "syn42496944", "syn42497095", "syn42497605", "syn42497715", 
             "syn42498285", "syn42498394", "syn42499504", "syn42499571", "syn42499654", "syn42499748", 
             "syn42499885", "syn42500048", "syn42500282", "syn42500555", "syn42501335", "syn42501571", 
             "syn42501651", "syn42501733", "syn42502017", "syn42502102", "syn42502501", "syn42502622", 
             "syn42505441", "syn42506013", "syn42506231", "syn42506456", "syn42506737", "syn42507029", 
             "syn42507347", "syn42507708", "syn42511268", "syn42511501", "syn42511705", "syn42511911", 
             "syn42512118", "syn42512355", "syn42512591", "syn42512831", "syn42513106", "syn42513434", 
             "syn42515415", "syn42515664", "syn42516848", "syn42517206", "syn42518546", "syn42518714", 
             "syn42519555", "syn42519802", "syn42520899", "syn42521250", "syn42523614", "syn42523941", 
             "syn42524238", "syn42524531", "syn42524677", "syn42524731", "syn42524778", "syn42524824", 
             "syn42524869", "syn42524919", "syn42525058", "syn42525100", "syn47887173", "syn47887429", 
             "syn47888039", "syn47888593", "syn47888897", "syn47890513", "syn47891367", "syn47891823", 
             "syn47894617", "syn47895183", "syn47896944", "syn47897283", "syn47897523", "syn47897747", 
             "syn47899102", "syn47899382", "syn47902580", "syn47902763", "syn47903154", "syn47903630", 
             "syn47905003", "syn47905209", "syn47905483", "syn47905977", "syn47906435", "syn47907611", 
             "syn47907864", "syn47908215", "syn47908536", "syn47908960", "syn47909658", "syn47909935", 
             "syn47992847", "syn47993917", "syn47994219", "syn47994628", "syn47999537", "syn47999908", 
             "syn48000588", "syn48001602", "syn48001754", "syn48001903", "syn48002338", "syn48003313", 
             "syn51614242", "syn51614243", "syn51614246", "syn51614248", "syn51614250", "syn51614253", 
             "syn51614254", "syn51614256", "syn51614272", "syn51614280", "syn51614284", "syn51614301")

The error was:

Error in py_call_impl(callable, call_args$unnamed, call_args$named) : 
  synapseclient.core.exceptions.SynapseHTTPError: 400 Client Error: 
Duplicate column name: 'age'

I suspect this happens when the Dataset function tries to add annotation columns to the dataset automatically. I've seen this on Synapse before: when the same annotation key has different types on different files (e.g. int on one file, char on another), multiple columns with the same name get created in the view/dataset schema.
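To make the suspected cause concrete, here is a minimal sketch in plain R of how one might check a set of annotations for keys whose value type differs between files. The cause itself is only a suspicion at this point. The `annotations` list, the file labels, and `find_conflicting_keys()` are hypothetical illustrations, not part of nfportalutils or synapseclient.

```r
# Hypothetical annotations for two files: `age` is an integer on one
# and a character string on the other (made-up data for illustration).
annotations <- list(
  file_a = list(age = 12L, sex = "F"),
  file_b = list(age = "12", sex = "M")
)

# Hypothetical helper: return the annotation keys that appear with more
# than one R class across files, together with the classes seen.
find_conflicting_keys <- function(annotations) {
  keys <- unique(unlist(lapply(annotations, names)))
  types_by_key <- lapply(keys, function(k) {
    # Only consider files that actually carry this key.
    present <- Filter(function(a) !is.null(a[[k]]), annotations)
    unique(vapply(present, function(a) class(a[[k]])[1], character(1)))
  })
  names(types_by_key) <- keys
  # Keep only keys with more than one distinct type.
  Filter(function(types) length(types) > 1, types_by_key)
}

conflicts <- find_conflicting_keys(annotations)
print(conflicts)
```

If a check like this flags a key such as `age`, that would be consistent with the duplicate-column error, though it does not confirm how Synapse builds the schema.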

This update makes the function a bit safer by catching these errors and retrying the dataset build without automatically adding the annotation columns.
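The catch-and-retry idea can be sketched roughly as follows. This is not the actual change in this PR: `build_dataset()` is a hypothetical stand-in that only simulates the Synapse error, and the `add_annotation_columns` argument is an assumed name, not the real nfportalutils or synapseclient parameter.

```r
# Hypothetical stand-in for the dataset-creation step (NOT the real API).
# It simulates Synapse rejecting the schema when annotation columns are added.
build_dataset <- function(items, add_annotation_columns = TRUE) {
  if (add_annotation_columns) {
    stop("400 Client Error: Duplicate column name: 'age'")
  }
  list(items = items, annotation_columns = add_annotation_columns)
}

# Try with annotation columns first; on a duplicate-column error,
# warn and retry without them. Any other error is re-raised unchanged.
build_dataset_safely <- function(items) {
  tryCatch(
    build_dataset(items, add_annotation_columns = TRUE),
    error = function(e) {
      if (!grepl("Duplicate column name", conditionMessage(e), fixed = TRUE)) {
        stop(e)
      }
      warning("Duplicate annotation columns; retrying without them.",
              call. = FALSE)
      build_dataset(items, add_annotation_columns = FALSE)
    }
  )
}

result <- suppressWarnings(
  build_dataset_safely(c("syn15263929", "syn15263990"))
)
```

Matching on the error message keeps the fallback narrow, so unrelated failures (permissions, network) still surface instead of being silently retried.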

To test, I think you can just run the function above with your principalId and a sandbox project_id.

allaway commented 9 months ago

Now that I'm thinking about it, maybe the place to change this is actually in:

https://github.com/nf-osi/nfportalutils/blob/93a65d6e572592a05b83adf6608cd3d1a68a6282/R/datasets.R#L151