Closed zyh4482 closed 2 years ago
I've successfully finished the test. I set the names argument with the following script, so that the FASTQ files of every sample in the same folder are picked up automatically:
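import os

# data_path: folder holding the FASTQ files (must be defined beforehand)
names = []
for i in os.listdir(data_path):
    # Drop the last extension, then keep everything before the last '_R'
    portion = os.path.splitext(i)
    tmp_out = portion[0][0:portion[0].rfind('_R')]
    names.append(tmp_out)
# De-duplicate (R1 and R2 share a prefix) and sort
names = sorted(list(set(names)))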
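For folders that also hold non-FASTQ files, or file names such as sample_R1_001.fastq.gz, a slightly more defensive variant of the script above might look like the sketch below. The names sample_names, READ_TAG and FASTQ_SUFFIXES are hypothetical and not part of the pipeline, and the sketch assumes each paired-end file carries an _R1 or _R2 tag followed by '_' or '.'.

```python
import re
from pathlib import Path

# Hypothetical helper, not part of the pipeline.
# Assumption: read tags look like _R1 / _R2 and are followed by '_' or '.'.
READ_TAG = re.compile(r"_R[12](?=[_.])")
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def sample_names(file_names):
    """Return the sorted, de-duplicated prefixes that precede _R1/_R2."""
    names = set()
    for file_name in file_names:
        base = Path(file_name).name
        if not base.endswith(FASTQ_SUFFIXES):
            continue  # skip anything that is not a FASTQ file
        matches = list(READ_TAG.finditer(base))
        if matches:
            # Cut at the last read tag, mirroring rfind('_R') in the script above.
            names.add(base[: matches[-1].start()])
    return sorted(names)
```

With a real folder this could be called as names = sample_names(os.listdir(data_path)), after importing os and setting data_path.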
May I ask an additional question here? I noticed that the GLM calculation step reported many warning messages. For example:
Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class", :
Column 'junc_cdf1_glmnet_twostep' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class", :
Column 'refName_readStrandR1' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class", :
Column 'refName_readStrandR2' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class", :
Column 'gene_strandR1A_new' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class", :
Column 'gene_strandR1B_new' does not exist to remove
Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glm' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glm_corrected' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glmnet' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glmnet_constrained' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glmnet_corrected' does not exist to remove
6: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected", :
Column 'junc_cdf_glmnet_corrected_constrained' does not exist to remove
Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_glm' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_corrected' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_glmnet' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_glmnet_constrained' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_glmnet_corrected' does not exist to remove
6: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected", :
Column 'p_predicted_glmnet_corrected_constrained' does not exist to remove
Warning messages:
1: In `[.data.table`(GLM_output, , `:=`(frac_mutimapping, NULL)) :
Column 'frac_mutimapping' does not exist to remove
2: In `[.data.table`(GLM_output, , `:=`(train, NULL)) :
Column 'train' does not exist to remove
May I ask whether these messages affect the results?
Thank you.
These warning messages are normal. Each time the script is run, it makes sure that the columns named in the warnings are not present in the input file, and removes them if they are. When a column is already absent, data.table simply reports that there was nothing to remove. These warnings are nothing to worry about, and you should get the output files without problems.
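To check that a saved log contains only this kind of warning, a small filter along the following lines could be used. It is a sketch only: split_warnings, MISSING_COLUMN and CONTEXT_LINE are hypothetical names, and the patterns assume the message format shown in the log above.

```python
import re

# Assumption: messages have the form shown in the log above.
MISSING_COLUMN = re.compile(r"Column '([^']+)' does not exist to remove")
# Numbered lines such as "1: In `[.data.table`(...) :" only give the call context.
CONTEXT_LINE = re.compile(r"^\d+: In .*:$")


def split_warnings(log_text):
    """Return (absent_columns, other_lines) from an R warning log."""
    absent, other = [], []
    for line in log_text.splitlines():
        line = line.strip()
        if not line or line.startswith("Warning message"):
            continue
        found = MISSING_COLUMN.search(line)
        if found:
            absent.append(found.group(1))  # harmless "nothing to remove" case
        elif CONTEXT_LINE.match(line):
            continue  # call context only, no message on this line
        else:
            other.append(line)  # anything else deserves a closer look
    return absent, other
```

If the second list comes back empty, every warning in the log is of the harmless "does not exist to remove" type.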
For Smart-seq2, there are many files with different prefixes. Although I can write a script to create such a list and assign it to the input argument names, this is inconvenient and the list could be very large. May I ask how you usually build the names list for SS2 data? Thanks.