Too many prefixes in input argument "names" for ss2 data

zyh4482 commented 2 years ago

For Smart-seq2, there are many files with different prefixes. Although I can write a script to build such a list and pass it to the input argument names, this is inconvenient and the list can be very large. How do you usually build the names list for ss2 data? Thanks.

zyh4482 commented 2 years ago

I've successfully finished the test. I set the names argument with the following script, which automatically collects the FASTQ prefix of every sample in the same folder:

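import os

# data_path: folder containing the paired FASTQ files (defined elsewhere in the script)
names = set()
for fname in os.listdir(data_path):
    stem = os.path.splitext(fname)[0]
    idx = stem.rfind('_R')
    if idx != -1:  # skip files that have no _R1/_R2 read marker
        names.add(stem[:idx])
# one entry per sample, in a stable order
names = sorted(names)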
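
If the folder may also contain files that are not paired FASTQ reads, matching the read suffix explicitly is one option. The following is only a minimal sketch: the helper name sample_names is hypothetical, and the pattern assumes files are named <sample>_R1... or <sample>_R2... and end in .fastq or .fq, optionally followed by .gz. Adjust the pattern if your naming scheme differs.

```python
import re

# Assumed naming scheme (not taken from SICILIAN itself):
#   <sample>_R1.fastq.gz, <sample>_R2_001.fastq, <sample>_R1.fq, ...
READ_SUFFIX = re.compile(r"_R[12](?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$")


def sample_names(filenames):
    """Return the sorted, de-duplicated sample prefixes of paired FASTQ files."""
    names = set()
    for fname in filenames:
        match = READ_SUFFIX.search(fname)
        if match:  # ignore anything that is not an R1/R2 FASTQ file
            names.add(fname[:match.start()])
    return sorted(names)


example = [
    "cellA_R1.fastq.gz",
    "cellA_R2.fastq.gz",
    "cellB_R1_001.fastq.gz",
    "cellB_R2_001.fastq.gz",
    "README.txt",
]
print(sample_names(example))  # ['cellA', 'cellB']
```

With the real data this would be called as names = sample_names(os.listdir(data_path)), after import os.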

May I ask an additional question here? I noticed that the GLM step reported many warning messages. For example:

Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class",  :
  Column 'junc_cdf1_glmnet_twostep' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class",  :
  Column 'refName_readStrandR1' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class",  :
  Column 'refName_readStrandR2' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class",  :
  Column 'gene_strandR1A_new' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("cur_weight", "train_class",  :
  Column 'gene_strandR1B_new' does not exist to remove

Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glm' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glm_corrected' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glmnet' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glmnet_constrained' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glmnet_corrected' does not exist to remove
6: In `[.data.table`(class_input, , `:=`(c("junc_cdf_glm", "junc_cdf_glm_corrected",  :
  Column 'junc_cdf_glmnet_corrected_constrained' does not exist to remove
Warning messages:
1: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_glm' does not exist to remove
2: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_corrected' does not exist to remove
3: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_glmnet' does not exist to remove
4: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_glmnet_constrained' does not exist to remove
5: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_glmnet_corrected' does not exist to remove
6: In `[.data.table`(class_input, , `:=`(c("p_predicted_glm", "p_predicted_corrected",  :
  Column 'p_predicted_glmnet_corrected_constrained' does not exist to remove

Warning messages:
1: In `[.data.table`(GLM_output, , `:=`(frac_mutimapping, NULL)) :
  Column 'frac_mutimapping' does not exist to remove
2: In `[.data.table`(GLM_output, , `:=`(train, NULL)) :
  Column 'train' does not exist to remove

Do these messages affect the results?

Thank you.

roozbehdn commented 2 years ago

These warning messages are normal. Each time the script runs, it makes sure that the columns named in the warnings are not in the input file, removing them if they are present. When a column is already absent, the removal produces the warning instead. These warnings are nothing to worry about, and you should get the output files fine.

salzman-lab / SICILIAN

Too many prefixes in input argument "names" for ss2 data #15