Open claczny opened 9 years ago
Currently running a rather large-scale visualization (211k points) via the command line, and it seems to work nicely so far.
However, one problem exists: a confirmation/information dialog is displayed to report the number of k-mers that were ignored due to non-DNA-alphabet letters. This should not happen on the command line, as it pauses execution until the user clicks "Ok". A dialog is correct for the GUI, but in CLI mode the message should just be printed to the console.
Results are in and look fine.
However, I realized that "Debug -> K-mer data" no longer works.
-> Open points:
should be fixed by now, test it and let me know
Have to test these still.
Additionally it appears that the labels/annotations, when provided in GUI-mode, are not respected.
Can you send me a test file? It works for me :)
What was the end result of this? I'm interested in running this from the command line but I'm having a hard time figuring out how to do it. Did this ever get finished?
@imiller4 I did some development to make it work, but all of my work is in the devel branch. Cedric was planning to merge it, but I think he was too busy with his PhD thesis/defence to test and merge it. You can try it by downloading the devel branch and compiling it with the command: ant jar from the src/interface/VizBin directory. This gives you a VizBin-dist.jar file that you can run.
To run it from the console, the command goes like this: java -jar VizBin-dist.jar -i ../testFiles/smallInput/EqualSet02.fa -o test.txt
If you run it without parameters, it should print a help menu listing all available parameters.
If you have problems let me know. I haven't been working on the project for some time, but I should be able to fix problems.
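For anyone who wants to call this from a pipeline, here is a minimal Python sketch that wraps the command quoted above. It is only an illustration: the function names are made up for this example, and it uses only the -i and -o flags mentioned in this thread (run the jar without parameters to see what else is available).

```python
import subprocess


def build_vizbin_command(jar_path, input_fasta, output_file):
    """Build the argument list for the CLI call quoted in this thread.

    Only the -i and -o flags are used, since those are the only ones
    shown here; this helper name is hypothetical.
    """
    return ["java", "-jar", str(jar_path),
            "-i", str(input_fasta),
            "-o", str(output_file)]


def run_vizbin(jar_path, input_fasta, output_file):
    """Run the jar without a GUI and return the completed process.

    check=True makes subprocess raise CalledProcessError if java exits
    with a non-zero status.
    """
    cmd = build_vizbin_command(jar_path, input_fasta, output_file)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Example call (requires Java and the jar built from the devel branch):
# result = run_vizbin("VizBin-dist.jar",
#                     "../testFiles/smallInput/EqualSet02.fa",
#                     "test.txt")
# print(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.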
Kind of a tangent, but I think still relevant. I'm trying to figure out what the input parameters are to https://github.com/claczny/VizBin/blob/master/src/backend/bh_tsne/ptsne.cpp
It looks like the following:
29008
50
0.5
30.0
1
0
Then followed by PCs separated by space:
4.89921157518091 -0.9638536747252275 0.43668068008376854 1.1225920780123133 -3.916186188292354 -0.30470803504843585 -1.0530214774492137 -1.9880912151282137 0.8631048233222588 1.4790223048431803 1.0249498256553173 -0.8853645854433776 0.4108939764274563 -1.1218002496079318 0.6389739442706567 -1.1728560033336835 0.44705357653429467 -0.27976701019745814 -0.3893058185461304 1.084794143645103
How would one use the ptsne.cpp directly on the commandline? (Apologies if this is getting technical, this implementation is just really fast and could be useful for other types of datasets)
Thinking about writing a Python wrapper for it to share here in case anyone would find it useful:
https://github.com/scikit-learn/scikit-learn/issues/10023#issuecomment-340052406
Example hosted at: https://drive.google.com/open?id=0Bx2FlTJ8g3XRa1F5OU9lc0NSTEU
Hi @jolespin,
I think your question doesn't have much to do with the original question, so I would move it to a separate thread :).
Anyway, to answer your question: take a look here. If you ignore the comments (which are a bit out of date...), you will see that the 5th parameter is the number of threads to use.
If you plan to write a wrapper for Python, then you will need precompiled versions for the different operating systems.
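As a starting point for such a wrapper, here is a rough Python sketch that writes text in the same layout as the sample input posted in this thread: six header values, one per line, followed by one row of space-separated values per data point. The only header field whose meaning is confirmed here is the 5th (number of threads); the sketch passes the other five through without interpreting them, and both function names are hypothetical.

```python
THREADS_INDEX = 4  # 5th header value = number of threads (per this thread)


def with_threads(header, threads):
    """Return a copy of the six header values with the thread count replaced."""
    if len(header) != 6:
        raise ValueError("expected six header values, as in the sample input")
    updated = list(header)
    updated[THREADS_INDEX] = int(threads)
    return updated


def format_ptsne_input(header, rows):
    """Format header values and data rows in the layout of the sample input.

    header: six values, written one per line and otherwise left untouched.
    rows:   iterable of numeric sequences, written space-separated, one
            row per line.
    """
    if len(header) != 6:
        raise ValueError("expected six header values, as in the sample input")
    lines = [str(value) for value in header]
    for row in rows:
        lines.append(" ".join(repr(float(x)) for x in row))
    return "\n".join(lines) + "\n"


# Example: reuse the header from the sample input, but ask for 4 threads.
# header = with_threads([29008, 50, 0.5, 30.0, 1, 0], 4)
# text = format_ptsne_input(header, my_rows)  # my_rows: your own data
```

How the resulting text has to be handed to the compiled binary (file name, stdin, arguments) is not covered in this thread, so that part is left out on purpose.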
This CLI version of VizBin returns a coordinates file. Is there any sort of output that informs on which contigs are grouped into which automatic clusters?
Hi @iquasere,
VizBin was originally not designed to bin sequences automatically, but rather to put the human in the loop and allow expert input.
You can, however, take the 2D coordinates and run your clustering algorithm of choice on them. We found that DBSCAN typically works nicely.
Hope this helps.
Best,
Cedric
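To illustrate the idea of clustering the 2D coordinates, here is a small, dependency-free DBSCAN sketch in Python. It is not part of VizBin. It takes the coordinates as a list of (x, y) tuples, because the layout of the coordinates file is not described in this thread, and the eps and min_samples values in the example are arbitrary. This version is O(n^2), so for real data sets an indexed implementation such as sklearn.cluster.DBSCAN is the more practical choice.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_samples):
    """Label each 2D point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None = not visited yet
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: claimed, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is core: keep expanding
    return labels


coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(coords, eps=1.5, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Each contig then gets the label at the same index as its coordinate row, and -1 marks points that were not assigned to any cluster.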
Thank you for the fast reply!
And congratulations on a nice-looking tool!
Thx and glad you like it :)
It would be nice to have a way to run VizBin in "non-interactive" mode. That means that the user should be able to set the values of the individual fields (e.g., textfield_file) at the command line (e.g., java -jar -Dtextfield_file=/path/to/file.fasta) and then let the whole thing run through all steps without showing the GUI and without requiring a mouse click on the "Start" button to execute the run.
This is particularly useful when integrating VizBin into a pipeline (e.g., to visualize individual clusters as they are returned from an automated binning algorithm). Of course, the user would then not immediately want to do any polygonal selection or such, but rather have the points.txt file saved for later use.
This might be related to #11.