claczny / VizBin

Repository of our application for human-augmented binning

Allowing to specify input files at the command-line #15

Open claczny opened 9 years ago

claczny commented 9 years ago

It would be nice to have a way to run VizBin in "non-interactive" mode. That is, the user should be able to set the values of the individual fields (e.g., textfield_file) at the command line (e.g., java -jar -Dtextfield_file=/path/to/file.fasta) and then let the whole run go through all steps without showing the GUI and without requiring a mouse click on the "Start" button.

This is particularly useful when integrating VizBin into a pipeline (e.g., to visualize individual clusters as they are returned from an automated binning algorithm). Of course, the user would then not immediately want to do any polygonal selection or the like, but would rather have the points.txt file saved for later use.

This might be related to #11.

claczny commented 9 years ago

Currently running a rather large-scale visualization (211k points) via the command line and it seems to work nicely so far.

However, one problem exists: a confirmation/information dialog is displayed to report the number of k-mers ignored due to non-DNA-alphabet letters. This should not happen on the command line, as it pauses execution until the user clicks "Ok". The dialog is correct for the GUI, but in CLI mode the message should simply be printed to the console.

claczny commented 9 years ago

Results are in and look fine.

However, I realized that "Debug -> K-mer data" no longer works.

-> Open points:

piotr-gawron commented 9 years ago

Should be fixed by now. Test it and let me know.

claczny commented 9 years ago

I still have to test these.

Additionally, it appears that the labels/annotations, when provided in GUI mode, are not respected.

piotr-gawron commented 9 years ago

Can you send me a test file? Because it works for me :)

IzaakMiller commented 8 years ago

What was the end result of this? I'm interested in running this from the command line but I'm having a hard time figuring out how to do it. Did this ever get finished?

piotr-gawron commented 8 years ago

@imiller4 I did some development to get it working, but all of my work is in the devel branch. Cedric was planning to merge it, but I think he was too occupied with his PhD thesis/defence to test and merge it. You can try it by downloading the devel branch and compiling it with the command ant jar from the src/interface/VizBin directory; you will get a VizBin-dist.jar file that you can run.

To run it from the console, the command goes like this: java -jar VizBin-dist.jar -i ../testFiles/smallInput/EqualSet02.fa -o test.txt

If you run it without parameters, it should give you a help menu with all available parameters.

If you have problems let me know. I haven't been working on the project for some time, but I should be able to fix problems.
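For pipeline use, the command above could be wrapped in a small script. The following is only a sketch: it assumes the devel-branch jar has been built as described and uses nothing beyond the -i and -o options shown in the example command. The function names build_vizbin_command and run_vizbin are hypothetical helpers, not part of VizBin.

```python
import subprocess
from pathlib import Path


def build_vizbin_command(jar, fasta, out):
    """Build the argument list for a non-interactive VizBin run.

    Only the -i (input FASTA) and -o (output coordinates) options from the
    example command in this thread are used; any other option is unknown here.
    """
    return ["java", "-jar", str(jar), "-i", str(fasta), "-o", str(out)]


def run_vizbin(jar, fasta, out):
    """Run VizBin and return the path of the coordinates file.

    Hypothetical helper: raises subprocess.CalledProcessError if java exits
    with a non-zero status.
    """
    subprocess.run(build_vizbin_command(jar, fasta, out), check=True)
    return Path(out)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.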

jolespin commented 7 years ago

Kind of a tangent, but I think still relevant. I'm trying to figure out what the input parameters are for https://github.com/claczny/VizBin/blob/master/src/backend/bh_tsne/ptsne.cpp

It looks like the following:

  1. number of points
  2. number of PCA dimensions for input
  3. theta
  4. perplexity
  5. unknown?
  6. random_state

These are then followed by the PCs, separated by spaces.

How would one use ptsne.cpp directly on the command line? (Apologies if this is getting technical; this implementation is just really fast and could be useful for other types of datasets.)

Thinking about writing a Python wrapper for it to share here in case anyone would find it useful: https://github.com/scikit-learn/scikit-learn/issues/10023#issuecomment-340052406

29008
50
0.5
30.0
1
0
4.89921157518091 -0.9638536747252275 0.43668068008376854 1.1225920780123133 -3.916186188292354 -0.30470803504843585 -1.0530214774492137 -1.9880912151282137 0.8631048233222588 1.4790223048431803 1.0249498256553173 -0.8853645854433776 0.4108939764274563 -1.1218002496079318 0.6389739442706567 -1.1728560033336835 0.44705357653429467 -0.27976701019745814 -0.3893058185461304 1.084794143645103 

Example hosted at: https://drive.google.com/open?id=0Bx2FlTJ8g3XRa1F5OU9lc0NSTEU

piotr-gawron commented 7 years ago

Hi @jolespin,

I think your question doesn't have much to do with the original question, so I would move it to a separate thread :).

Anyway, to answer your question: take a look here. If you ignore the comments (which are a bit out of date...), you will see that the 5th parameter is the number of threads to be used.

If you plan to write a wrapper for Python, then you need to have a precompiled version for each OS.
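Putting the two comments together, the input layout discussed here could be generated from Python roughly as follows. This is a sketch based only on what is said in this thread: the six header fields follow the list above, with the fifth taken to be the number of threads; the meaning of the sixth field ("random_state") is the poster's reading and is not confirmed; and writing one point per line is an assumption drawn from the posted example. The function name format_ptsne_input is hypothetical, and how the resulting text is handed to the compiled program is not covered.

```python
def format_ptsne_input(rows, theta=0.5, perplexity=30.0, threads=1, seed=0):
    """Serialize PCA-reduced points in the layout described in this thread.

    Header, one value per line: number of points, number of input
    dimensions, theta, perplexity, number of threads, and a sixth value
    (called "random_state" above; unconfirmed). Each point then follows
    as space-separated floats, one point per line (assumed).
    """
    if not rows:
        raise ValueError("at least one point is required")
    dims = len(rows[0])
    if any(len(row) != dims for row in rows):
        raise ValueError("all points must have the same number of dimensions")

    header = [
        str(len(rows)),
        str(dims),
        repr(float(theta)),
        repr(float(perplexity)),
        str(int(threads)),
        str(int(seed)),
    ]
    body = [" ".join(repr(float(v)) for v in row) for row in rows]
    return "\n".join(header + body) + "\n"
```

The default values mirror the header of the example posted above (0.5, 30.0, 1, 0).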

iquasere commented 6 years ago

This CLI version of VizBin returns a coordinates file. Is there any sort of output that indicates which contigs are grouped into which automatic clusters?

claczny commented 6 years ago

Hi @iquasere,

VizBin was originally not designed to bin sequences automatically, but rather to put the human in the loop and allow expert input.

You can, however, use the 2D coordinates and run your clustering algorithm of choice on them. We found that DBSCAN typically works nicely.

Hope this helps.

Best,

Cedric
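As an illustration of the suggestion above, clustering the 2D coordinates with DBSCAN might look like the sketch below. It is a minimal pure-Python DBSCAN kept dependency-free for clarity; in practice a library implementation such as scikit-learn's sklearn.cluster.DBSCAN would be the usual choice. The coordinates format (one point per line, x and y separated by a comma or whitespace) is an assumption, and the eps and min_samples values are placeholders that need tuning per dataset.

```python
import math
import re

NOISE = -1  # label for points that belong to no cluster


def parse_points(text):
    """Parse one 2D point per line; comma- or whitespace-separated (assumed)."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        x, y = (float(v) for v in re.split(r"[,\s]+", line)[:2])
        points.append((x, y))
    return points


def dbscan(points, eps, min_samples):
    """Return one cluster label per point (0, 1, ...), or NOISE."""

    def neighbors(i):
        # Brute-force neighborhood query, O(n) per point; includes i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels
```

Since the coordinates presumably appear in the same order as the sequences in the input FASTA (an assumption to verify), the label at position i can be mapped back to the i-th contig.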

iquasere commented 6 years ago

Thank you for the fast reply!

And congratulations on a nice-looking tool!

claczny commented 6 years ago

Thx and glad you like it :)