Caleydo / caleydo

Caleydo - Visualization for Molecular Biology
http://caleydo.org
BSD 3-Clause "New" or "Revised" License

Generate TCGA cal files for release #1565

Closed mstreit closed 11 years ago

mstreit commented 11 years ago

We should probably wait for Nils to confirm the 40% threshold in the sampling. Other than that we are ready to go.

mstreit commented 11 years ago

@ngehlenborg RFC

ngehlenborg commented 11 years ago

Firehose uses 80% (at least in the code that I have seen) plus imputation of missing values (#1555). Should we stick to this for now to make sure that we get similar gene lists when sampling?

However, the look of the matrices with missing values concerns me a bit. I think imputation of missing values should have a higher priority than I originally thought.
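To make the threshold discussion concrete, here is a minimal sketch of such a filter. It assumes the percentage is the minimum fraction of non-missing values a gene needs in order to be kept during sampling; the thread does not spell this out, so treat that reading as an assumption. The names `fraction_present` and `filter_genes` and the sample data are hypothetical and are taken from neither Caleydo nor Firehose. Imputation (#1555) is not shown.

```python
import math


def fraction_present(row):
    """Return the fraction of values in `row` that are not missing (None or NaN)."""
    if not row:
        return 0.0
    present = sum(
        1
        for value in row
        if value is not None and not (isinstance(value, float) and math.isnan(value))
    )
    return present / len(row)


def filter_genes(matrix, min_present=0.8):
    """Keep genes whose fraction of present values is at least `min_present`.

    `matrix` maps a gene name to its list of values (hypothetical layout).
    """
    return {
        gene: row
        for gene, row in matrix.items()
        if fraction_present(row) >= min_present
    }


# Hypothetical toy data: three genes measured in five samples.
example = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [1.0, None, 3.0, 4.0, 5.0],
    "GENE_C": [None, None, 3.0, float("nan"), 5.0],
}

print(sorted(filter_genes(example)))                   # 80% threshold
print(sorted(filter_genes(example, min_present=0.4)))  # 40% threshold
```

On this toy data the 80% setting keeps `GENE_A` and `GENE_B`, while 40% also keeps `GENE_C`. That shows why the gene lists would differ from Firehose's if the thresholds did not match.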

mstreit commented 11 years ago

Related to #1534

mstreit commented 11 years ago

@ngehlenborg OK, the version in the repository uses 80%.

mstreit commented 11 years ago

Also waiting for #1585

mstreit commented 11 years ago

#1585 is fixed.

We decided to not wait for #1555. That means we are ready to generate the TCGA cal files.

sgratzl commented 11 years ago

what clusterer should we use: kmeans, tree or affinity?

mstreit commented 11 years ago

tree

sgratzl commented 11 years ago

We also have to discuss where to put them on the server.

Currently the TCGA data browser looks in the 3.0 directory, like the pathway/mapping cache loader.

The projects, however, claim to be stored in the 3.0.2 directory.

mstreit commented 11 years ago

We somehow need an indicator in the code that tells us when the data packages are incompatible with an older version. I don't think we can derive this information automatically from the version number. @sgratzl Do you have a suggestion for how to address this issue?

sgratzl commented 11 years ago

> We somehow need an indicator in the code that tells us when the data packages are incompatible with an older version. I don't think we can derive this information automatically from the version number.

The data packages' meta info file contains the Caleydo version with which they were produced. That is not the problem.

It is again about where to put the files on the server, as I don't know whether we are upward-compatible (Caleydo 3.0.0 opening a 3.0.2 project).

mstreit commented 11 years ago

Being downward-compatible is important. I would ignore upward compatibility for now. People should use the latest version; however, they should be able to load their old projects if possible.

mstreit commented 11 years ago

Let's stick to our policy that all 3.0.* packages are compatible with the current build (3.0.2). So the TCGA packages as well as the sample projects will be stored under "3.0" on the server. However, we should rename "3.0/tcga" to "3.0/tcga_sampled" before moving all new TCGA packages with the full matrices to "3.0/tcga".
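The policy above can be sketched roughly as follows. This is only an illustration, assuming plain dotted numeric version strings. The helpers `parse_version`, `server_directory`, and `can_load` are hypothetical names, not part of the Caleydo code base, and the rule that a build loads only packages from its own major.minor line that are not newer than itself is one reading of the downward-compatibility discussion.

```python
def parse_version(version):
    """Split a dotted version string such as '3.0.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def server_directory(version):
    """Map a build or package version to the shared server directory.

    All 3.0.* versions share one directory, e.g. '3.0.2' -> '3.0'.
    """
    major, minor = parse_version(version)[:2]
    return f"{major}.{minor}"


def can_load(build_version, package_version):
    """Downward compatibility only (assumed rule).

    A build can load a package from the same major.minor line that is not
    newer than the build itself. Upward compatibility is ignored.
    """
    build = parse_version(build_version)
    package = parse_version(package_version)
    return build[:2] == package[:2] and package <= build


print(server_directory("3.0.2"))   # directory shared by all 3.0.* packages
print(can_load("3.0.2", "3.0.0"))  # current build opening an older package
print(can_load("3.0.0", "3.0.2"))  # older build opening a newer package
```

Under these assumptions both a 3.0.0 and a 3.0.2 package resolve to the `3.0` directory. The current build accepts the older package, while the reverse case is rejected.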

ngehlenborg commented 11 years ago

Yes, agree with all points!


On Aug 30, 2013, at 7:26 AM, Marc Streit wrote:

> Let's stick to our policy that all 3.0.* packages are compatible with the current build (3.0.2). So the TCGA packages as well as the sample projects will be stored under "3.0" on the server. However, we should rename "3.0/tcga" to "3.0/tcga_sampled" before moving all new TCGA packages with the full matrices to "3.0/tcga".


sgratzl commented 11 years ago

2013-05-21, 2013-04-23, 2013-03 and 2013-02 are online; the rest are available on demand.