khmer version 2.0 ERROR: I/O operation on closed file (#1693)
Open JifengTang opened 7 years ago
Hi Jifeng, please try:
https://github.com/dib-lab/khmer/archive/master.zip
to get the latest master branch of khmer.
best, --titus
Dear Titus,
How should I install it? Because I have version 2.0 installed.
Can I simply copy “normalize-by-median.py” to the folder “khmerEnv/bin/” ?
Thank you very much.
Cheers, Jifeng
Keygene N.V. | P.O. Box 216 | 6700 AE Wageningen | The Netherlands | T (+31) 317 46 68 66 | F (+31) 317 42 49 39 | CoC. 09066631 | http://www.keygene.com/
Sorry, I didn't give the complete command!
pip install https://github.com/dib-lab/khmer/archive/master.zip
will work. You cannot just copy normalize-by-median, I'm afraid ;).
Dear Titus,
It seems to be working. There are 78 fastq files in total, and about half have been processed.
I used:
nohup /data/sag2/2017/JTA_tools/khmerupdate/khmerEnv/bin/normalize-by-median.py -o RNAseqNormalized.fastq -C 100 -s Kmerupdate.tables -R RNAseq_Reportupdate -M 1800000000000 ../RNAseqInput/*fastq >processupdate.out &
For the “-C” option, the default is 20. I changed it to 100, although I am not sure that I should have.
I want to keep at least 100x coverage per transcript.
Does “-C” apply to the whole dataset or to each fastq file separately?
Thank you very much.
Cheers, Jifeng
-C CUTOFF, --cutoff CUTOFF when the median k-mer coverage level is above this number the read is not kept. (default: 20)
Hi Jifeng,
excellent.
-C is for the whole data set.
best, --titus
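To illustrate why -C acts on the whole data set rather than per file, here is a minimal toy sketch of median-based normalization. It is not khmer's implementation: it uses an exact Python Counter instead of khmer's memory-bounded count table, and the function names, the small k, and the example reads are all hypothetical. The point it tries to show is that a single count table is shared across every input file, so coverage already seen in one file counts against reads in the next.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a read."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=4, cutoff=2, counts=None):
    """Toy digital normalization (illustrative only, not khmer's code).

    A read is kept only while the median count of its k-mers, as seen
    so far in `counts`, is below `cutoff`. Kept reads update `counts`.
    """
    counts = Counter() if counts is None else counts
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept, counts


# Two hypothetical "files" containing the same highly covered read.
file_a = ["ACGTTGCA"] * 3
file_b = ["ACGTTGCA"] * 3

# One table shared across files, as in a single run over all inputs.
kept_a, table = normalize_by_median(file_a, cutoff=2)
kept_b, table = normalize_by_median(file_b, cutoff=2, counts=table)
print(len(kept_a), len(kept_b))  # 2 0

# A fresh table per file would keep reads from each file separately.
fresh_b, _ = normalize_by_median(file_b, cutoff=2)
print(len(fresh_b))  # 2
```

In this sketch the second file contributes nothing once the shared table has reached the cutoff, which is the behaviour that "whole data set" describes.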
I have seen issues https://github.com/dib-lab/khmer/issues/1320 and https://github.com/dib-lab/khmer/issues/1341, but I could not find a solution there. I really need to process a dataset quickly.
The command I used is below; the process ran on a computer with 2 TB of memory.
nohup /.. /khmer/khmerEnv/bin/normalize-by-median.py -o RNAseqNormalized.fastq -C 100 -s Kmer.tables -R RNAseq_Report -M 1500000000000 ../RNAseqInput/*fastq >process.out &
Installation:
mkdir khmer
sudo apt-get install python2.7-dev python-virtualenv python-pip gcc g++
cd khmer/
curl -O https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.6.tar.gz
tar xzf virtualenv*
cd virtualenv-*; python2.7 virtualenv.py ../khmerEnv; cd ..
source khmerEnv/bin/activate
pip2 install khmer
(khmerEnv)$ normalize-by-median.py -h
usage: normalize-by-median.py [-h] [--version] [--ksize KSIZE]
                              [--n_tables N_TABLES] [-U UNIQUE_KMERS]
                              [--fp-rate FP_RATE]
                              [--max-tablesize MAX_TABLESIZE | -M MAX_MEMORY_USAGE]
                              [-q] [-C CUTOFF] [-p] [--force_single]
                              [-u unpaired_reads_filename] [-s filename]
                              [-R report_filename]
                              [--report-frequency report_frequency] [-f]
                              [-o filename] [-l filename] [--gzip | --bzip]
                              input_sequence_filename
                              [input_sequence_filename ...]

Do digital normalization (remove mostly redundant sequences)

positional arguments:
  input_sequence_filename
                        Input FAST[AQ] sequence filename.