Open hknahal opened 5 years ago
@andricDu Just some more details. It looks like this bug is only affecting the PBCA-US project. When I'm not logged into the Portal and go to https://dcc.icgc.org/releases/current/Projects/PBCA-US, I can see the "simple_somatic_mutation.open.PBCA-US.tsv.gz" file:
But if I log in, the simple_somatic_mutation file disappears:
Expected behaviour:
The controlled version of the SSM file (i.e. "simple_somatic_mutation.controlled.PBCA-US.tsv.gz") should always exist if a user is logged into the Portal. By default:
So even if a project does not have any masked SSMs, the "simple_somatic_mutation.controlled.[ICGC-PROJECT].tsv.gz" file will still appear when the user is logged in and it will basically be a copy of the open version.
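For reference, the expected naming rule described above can be written down as a small check. This is only a sketch in Python: the helper names are made up for illustration, and the real Portal logic may key off DACO access rather than login state alone.

```python
from typing import List


def expected_ssm_filename(project_code: str, logged_in: bool) -> str:
    """Return the SSM file name a user should see for a project.

    Hypothetical helper: assumes the tier shown depends only on login state.
    """
    tier = "controlled" if logged_in else "open"
    return f"simple_somatic_mutation.{tier}.{project_code}.tsv.gz"


def ssm_file_missing(listing: List[str], project_code: str, logged_in: bool) -> bool:
    """True if the expected SSM file is absent from a directory listing."""
    return expected_ssm_filename(project_code, logged_in) not in listing
```

With a PBCA-US listing that contains only the open file, `ssm_file_missing(listing, "PBCA-US", logged_in=True)` would flag the missing controlled file, which is the behaviour reported here.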
I know we apply different masking rules to US projects (TARGET and TCGA). Since PBCA-US is the only US project that is not TARGET or TCGA, is it possible that we treated this project differently during the SSM masking step, and that the controlled version of the SSM file is somehow not being made available?
After a discussion with Dusan, this is acceptable. Those files were deleted manually from the database.
When you are logged in and have DACO access, the Portal gives you the UNMASKED donors. That data does not exist for the Kids First donors.
This is not a BUG - We need to investigate how to show the open access mutations by connecting to the production HDFS.
Similar issue reported by Lincoln, in the releases section:
Here's some bad behaviour in the data releases page, regarding when controlled and open tier data are displayed:
Question: Can we adjust this behavior so that:
We cannot show the open and controlled files together in the same directory; this would require a different design of the system.
Some options we can explore are:
Remove ".open" from some files, and they could show up when the user logs in
{ "code": 403, "message": "Forbidden - Please login or check permissions to access this resource." }
There's only 1 file in the PCAWG directory that needs to be renamed. Please remove ".open" from PCAWG/consensus_snv_indel/final_consensus_snv_indel_passonly_icgc.open.tgz
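In case it helps whoever does the rename, here is a rough Python sketch of dropping the ".open" tier marker from a file name. The function name is hypothetical and this is not how the release files are actually produced; it only shows the intended before and after.

```python
from pathlib import PurePosixPath


def strip_open_marker(path: str) -> str:
    """Remove the ".open" tier component from a file name (illustrative only)."""
    p = PurePosixPath(path)
    parts = p.name.split(".")
    # Keep the first component untouched; drop any later "open" component.
    kept = parts[:1] + [part for part in parts[1:] if part != "open"]
    return str(p.with_name(".".join(kept)))


renamed = strip_open_marker(
    "PCAWG/consensus_snv_indel/final_consensus_snv_indel_passonly_icgc.open.tgz"
)
print(renamed)
```

Names without an ".open" component come back unchanged, so the same helper could be run over a whole directory listing.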
While adding verbiage to "Having trouble downloading?", please update the link to http://docs.icgc.org/download/repositories so users don't get that annoying redirect message.
@christinayung tracking of tasks mentioned here: https://github.com/icgc-dcc/dcc-portal/issues/647
The open-access portion of the simple somatic mutation data download becomes unavailable when a user is logged into the Data Portal. This issue was brought to our attention when a user recently contacted us (see ticket at https://extsd.oicr.on.ca/projects/ICGCSD/queues/custom/11/ICGCSD-2518) about not being able to download simple somatic mutation data at https://dcc.icgc.org/donors/DO232224 when they were logged into the Data Portal. I can confirm their OpenID is correct and they do have access to controlled data. When I tried to replicate the issue by going to https://dcc.icgc.org/donors/DO232224 while not logged in and clicking on "Download Donor Data", I was able to see "Simple Somatic Mutation" in the list:
But if I am logged into the Data Portal, "Simple Somatic Mutation" disappears from the list:
This donor does have (open-access) SSM data, so it should be available to download whether the user is logged in or not.