dputhier / libgtftk

gtftk C Library and program
GNU General Public License v3.0

First develop_3.3 version #49

Closed fafa13 closed 6 years ago

fafa13 commented 6 years ago

I've just pushed to develop_3.3 a first attempt at managing indexes in columns. In the model, only the INDEX and COLUMN structures are modified, so I think you shouldn't need to change anything, since you normally don't use those structures. Please test this version asap and let me know the results.

dputhier commented 6 years ago

I have created a branch (develop_3.3) in gtftk that uses the develop_3.3 version of libgtftk. At the moment we are still getting lots of errors, but I guess this is encouraging.

Several tests fail for select_by_key, which is a bad sign for the other commands:

$ make bats_cmd CMD=select_by_key
 ✗ select_by_key_1
   (in test file gtftk_test.bats.sub, line 8)
     `[ "$result" -eq 5 ]' failed
 ✗ select_by_key_2
   (in test file gtftk_test.bats.sub, line 14)
     `[ "$result" -eq 9 ]' failed with status 2
   /var/folders/nl/stqvvbcn4pg9v65k8kyjv5yh0000gn/T/bats.84371.src: line 14: [: : integer expression expected
 ✗ select_by_key_3
   (in test file gtftk_test.bats.sub, line 20)
     `[ "$result" -eq 10 ]' failed
 ✗ select_by_key_4
   (in test file gtftk_test.bats.sub, line 26)
     `[ "$result" -eq 9 ]' failed with status 2
   /var/folders/nl/stqvvbcn4pg9v65k8kyjv5yh0000gn/T/bats.84371.src: line 26: [: : integer expression expected
 ✓ select_by_key_5
 ✗ select_by_key_6
   (in test file gtftk_test.bats.sub, line 38)
     `[ "$result" -eq 18 ]' failed
 ✗ select_by_key_7
   (in test file gtftk_test.bats.sub, line 45)
     `[ "$result" -eq 60 ]' failed
 ✓ select_by_key_8
 ✓ select_by_key_9
 ✓ select_by_key_10
 ✗ select_by_key_11
   (in test file gtftk_test.bats.sub, line 69)
     `[ "$result" -eq 25 ]' failed
 ✗ select_by_key_12
   (in test file gtftk_test.bats.sub, line 75)
     `[ "$result" -eq 45 ]' failed
 ✗ select_by_key_13
   (in test file gtftk_test.bats.sub, line 81)
     `[ "$result" -eq 15 ]' failed
 ✗ select_by_key_14
   (in test file gtftk_test.bats.sub, line 87)
     `[ "$result" = "2,2,21,27,32,49,64,64,106,106,124,124,175,179,209," ]' failed

14 tests, 10 failures
make: *** [bats_cmd] Error 1

Running the first test of select_by_key gives:

$ gtftk select_by_key  -k gene_id -v G0003 -i gtftk/data/simple/simple.gtf        
chr1  gtftk gene  50  61  . - . gene_id "G0003";
chr1  gtftk transcript  50  61  . - . gene_id "G0003"; transcript_id "G0003T001";
chr1  gtftk exon  50  54  . - . gene_id "G0003"; transcript_id "G0003T001"; exon_id "G0003T001E001";
chr1  gtftk exon  57  61  . - . gene_id "G0003"; transcript_id "G0003T001"; exon_id "G0003T001E002";
chr1  gtftk CDS 50  52  . - . gene_id "G0003"; transcript_id "G0003T001"; ccds_id "CDS_G0003T001";
Segmentation fault: 11
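
A crash like this one would also explain the `[: : integer expression expected` lines reported with status 2 in the test output above: that is what bash prints when `$result` is empty, as it would be if the pipeline produced no count. The sketch below is illustrative only and is not taken from gtftk_test.bats; the variable names `empty_status` and `wrong_status` are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Illustrative only: mimics a bats assertion whose command produced no output.

# Simulate a pipeline that yields nothing, e.g. because the program crashed.
result=""

# With an empty operand, `[ ... -eq ... ]` is a usage error rather than a plain
# mismatch: the shell complains about the operand and returns status 2.
empty_status=0
[ "$result" -eq 9 ] 2>/dev/null || empty_status=$?

# A well-formed but wrong number is an ordinary failed comparison (status 1).
result="5"
wrong_status=0
[ "$result" -eq 9 ] || wrong_status=$?

echo "empty result -> status $empty_status; wrong count -> status $wrong_status"
```

Read this way, the failures marked `failed with status 2` are probably cases where nothing usable reached `$result`, while the plain `failed` lines are cases where a number was produced but did not match.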

Note that commenting out the freeing function makes all the select_by_key tests pass. There is something to dig into here.

$ make bats_cmd CMD=select_by_key
 ✓ select_by_key_1
 ✓ select_by_key_2
 ✓ select_by_key_3
 ✓ select_by_key_4
 ✓ select_by_key_5
 ✓ select_by_key_6
 ✓ select_by_key_7
 ✓ select_by_key_8
 ✓ select_by_key_9
 ✓ select_by_key_10
 ✓ select_by_key_11
 ✓ select_by_key_12
 ✓ select_by_key_13
 ✓ select_by_key_14

14 tests, 0 failures

The results for the first ~200 tests are provided below.

(gtftk_conda) puthier-mac-book puthier ~/git/project_dev/gtftk
$ make test
 ✗ short_long_1
   (in test file gtftk_test.bats, line 6)
     `[ $result -eq 53 ]' failed
 ✗ short_long_2
   (in test file gtftk_test.bats, line 11)
     `result=$(gtftk short_long -i gtftk/data/simple_03/simple_short_long.gtf| grep -c -E 'G0001T002|G0002T002|G0003T002|G0006T002|G0008T002|G0011T001')' failed
 ✓ short_long_3
 ✗ short_long_4
   (in test file gtftk_test.bats, line 23)
     `result=$(gtftk short_long -i gtftk/data/simple_03/simple_short_long.gtf -l | grep -c -E 'G0001T001|G0002T001|G0003T001|G0006T001|G0008T001|G0011T002')' failed
 ✓ short_long_5
 ✗ short_long_6
   (in test file gtftk_test.bats, line 36)
     `[ $result -eq 11 ]' failed
 ✓ retrieve_1
 ✓ retrieve_2
 ✓ retrieve_3
 ✓ retrieve_4
 ✓ retrieve_5
 ✗ alternative_5p_exon_1
   (in test file gtftk_test.bats, line 82)
     `[ "$result" -eq 10 ]' failed
       |--- 17:04-ERROR-alternative_5p_exon : Could not find any 'transcript' or 'gene'  line in the file. Is this GTF file in ensembl format ? Try to convert it with convert_ensembl command.
 ✗ alternative_5p_exon_1
   (in test file gtftk_test.bats, line 82)
     `[ "$result" -eq 10 ]' failed
       |--- 17:04-ERROR-alternative_5p_exon : Could not find any 'transcript' or 'gene'  line in the file. Is this GTF file in ensembl format ? Try to convert it with convert_ensembl command.
 ✓ join_attr_1
 ✓ join_attr_2
 ✓ join_attr_3
 ✓ join_attr_4
 ✓ join_attr_5
 ✓ join_attr_6
 ✓ join_attr_7
 ✓ join_attr_8
 ✓ join_attr_9
 ✓ merge_attr_1
 ✗ rm_dup_tss_1
   (in test file gtftk_test.bats, line 158)
     `[ $result -eq 47 ]' failed
 ✗ rm_dup_tss_2
   (in test file gtftk_test.bats, line 163)
     `result=$(gtftk rm_dup_tss -i gtftk/data/simple_03/simple_rm.gtf | grep -c -E 'G0001T002|G0003T001|G0004T002|G0006T001|G0007T001|G0008T001')' failed
 ✓ rm_dup_tss_3
 ✓ rm_dup_tss_4
 ✗ convergent_1
   (in test file gtftk_test.bats, line 185)
     `[ "$result" -eq 3 ]' failed
 ✓ convergent_2
 ✗ convergent_3
   (in test file gtftk_test.bats, line 197)
     `[ "$result" -eq 25 ]' failed
 ✗ bed_to_gtf_1
   (in test file gtftk_test.bats, line 204)
     `[ "$result" -eq 15 ]' failed
 ✗ bed_to_gtf_2
   (in test file gtftk_test.bats, line 210)
     `[ "$result" -eq 9 ]' failed with status 2
   /var/folders/nl/stqvvbcn4pg9v65k8kyjv5yh0000gn/T/bats.57973.src: line 210: [: : integer expression expected
 ✓ count_1
 ✓ tabulate_1
 ✓ tabulate_2
 ✓ tabulate_3
 ✓ tabulate_4
 ✓ tabulate_5
 ✓ tabulate_6
 ✓ tabulate_7
 ✓ tabulate_8
 ✓ discretize_key_1
 ✓ discretize_key_2
 ✓ discretize_key_3
 ✓ add_prefix_1
 ✓ add_prefix_1
 ✓ add_prefix_2
 ✓ add_prefix_3
 ✓ add_prefix_4
 ✓ add_prefix_5
 ✓ add_prefix_6
 ✓ add_prefix_7
 ✗ convert_ensembl_1
   (in test file gtftk_test.bats, line 2197)
     `[ "$result" = "125,180,50,65,33,22,107,210,3,176," ]' failed
 ✗ convert_ensembl_2
   (in test file gtftk_test.bats, line 2202)
     `[ "$result" = "138,189,61,76,47,35,116,222,14,186," ]' failed
 ✗ convert_ensembl_3
   (in test file gtftk_test.bats, line 2207)
     `[ "$result" = "125,125,180,50,65,65,33,22,28,107,107,210,3,3,176," ]' failed
 ✗ convert_ensembl_4
   (in test file gtftk_test.bats, line 2212)
     `[ "$result" = "138,138,189,61,76,76,47,35,35,116,116,222,14,14,186," ]' failed
 ✓ convert_ensembl_5
 ✓ convert_ensembl_6
 ✓ convert_ensembl_7
 ✓ convert_ensembl_8
 ✓ convert_ensembl_9
 ✓ convert_ensembl_10
 ✓ convert_ensembl_11
 ✓ convert_ensembl_12
 ✓ heatmap_1
 ✓ heatmap_2
 ✓ heatmap_3
 ✓ heatmap_4
 ✓ heatmap_5
 ✓ heatmap_6
 ✓ heatmap_7
 ✓ heatmap_8
 ✓ heatmap_9
 ✓ heatmap_10
 ✓ heatmap_11
 ✓ heatmap_12
 ✓ select_by_regexp_1
 ✓ select_by_regexp_2
 ✓ select_by_regexp_3
 ✓ select_by_regexp_4
 ✓ midpoints_1
 ✓ midpoints_2
 ✓ midpoints_3
 ✓ midpoints_4
 ✓ midpoints_5
 ✓ midpoints_6
 ✓ midpoints_7
 ✓ midpoints_8
 ✓ midpoints_9
 ✓ midpoints_10
 ✗ divergent_1
   (in test file gtftk_test.bats, line 599)
     `[ "$result" = "G0003T001,G0004T001,G0004T002," ]' failed
 ✗ divergent_2
   (in test file gtftk_test.bats, line 605)
     `[ "$result" = "G0003T001,G0004T001,G0004T002," ]' failed
 ✗ divergent_3
   (in test file gtftk_test.bats, line 611)
     `[ "$result" -eq 70 ]' failed
 ✗ divergent_4
   (in test file gtftk_test.bats, line 617)
     `[ "$result" -eq 25 ]' failed
 ✗ divergent_5
   (in test file gtftk_test.bats, line 623)
     `[ "$result" -eq 4 ]' failed
 ✗ select_most_5p_tx_1
   (in test file gtftk_test.bats, line 631)
     `[ "$result" -eq 40 ]' failed
 ✗ select_most_5p_tx_2
   (in test file gtftk_test.bats, line 637)
     `[ "$result" -eq 50 ]' failed
 ✓ mk_matrix_1
 ✓ mk_matrix_2
 ✓ mk_matrix_3
 ✓ mk_matrix_4
 ✓ mk_matrix_5
 ✓ mk_matrix_6
 ✓ mk_matrix_7
 ✓ mk_matrix_8
 ✓ mk_matrix_9
 ✓ mk_matrix_10
 ✓ mk_matrix_11
 ✓ mk_matrix_12
 ✓ mk_matrix_13
 ✓ mk_matrix_14
 ✓ mk_matrix_15
 ✗ get_feat_seq_1
   (in test file gtftk_test.bats, line 745)
     `[ "$result" = "caagc,taatt," ]' failed
 ✗ get_feat_seq_2
   (in test file gtftk_test.bats, line 751)
     `[ "$result" = "gcttg,aatta," ]' failed
 ✗ get_feat_seq_3
   (in test file gtftk_test.bats, line 757)
     `[ "$result" = "tct,g,gc," ]' failed
 ✓ get_feat_seq_4
 ✗ get_feat_seq_5
   (in test file gtftk_test.bats, line 769)
     `[ "$result" = "atgt,aat,ag," ]' failed
 ✗ get_feat_seq_6
   (in test file gtftk_test.bats, line 775)
     `[ "$result" = "atgt,aat,ag," ]' failed
 ✗ get_feat_seq_7
   (in test file gtftk_test.bats, line 781)
     `[ "$result" = ">G0006T001|G0006|chr1|22|35|CDS|22|25,>G0006T001|G0006|chr1|22|35|CDS|28|30,>G0006T001|G0006|chr1|22|35|CDS|33|34," ]' failed
 ✓ get_example_1
 ✗ select_by_loc_1
   (in test file gtftk_test.bats, line 797)
     `[ "$result" = "G0002T001,G0010T001," ]' failed
   nb_loc = 1; nb_row = 2
 ✗ select_by_loc_2
   (in test file gtftk_test.bats, line 803)
     `[ "$result" = "G0010T001," ]' failed
   nb_loc = 1; nb_row = 1
 ✓ select_by_loc_3
 ✓ select_by_loc_4
 ✗ select_by_loc_5
   (in test file gtftk_test.bats, line 821)
     `[ "$result" = "G0001T001,G0001T002,G0007T001,G0007T002," ]' failed
   nb_loc = 1; nb_row = 4
 ✗ select_by_loc_6
   (in test file gtftk_test.bats, line 827)
     `[ "$result" -eq 4 ]' failed
       |--- 17:07-ERROR-select_by_loc : Could not find any 'transcript' or 'gene'  line in the file. Is this GTF file in ensembl format ? Try to convert it with convert_ensembl command.
 ✗ select_by_loc_7
   (in test file gtftk_test.bats, line 833)
     `[ "$result" -eq 4 ]' failed
       |--- 17:07-ERROR-select_by_loc : Could not find any 'transcript' or 'gene'  line in the file. Is this GTF file in ensembl format ? Try to convert it with convert_ensembl command.
 ✓ select_by_loc_8
 ✗ select_by_loc_9
   (in test file gtftk_test.bats, line 846)
     `[ "$result" -eq 4 ]' failed
   nb_loc = 1; nb_row = 1
 ✗ select_by_loc_10
   (in test file gtftk_test.bats, line 852)
     `[ "$result" -eq 7 ]' failed
   nb_loc = 1; nb_row = 1
 ✓ select_by_loc_11
 ✗ select_by_loc_12
   (in test file gtftk_test.bats, line 864)
     `[ "$result" -eq 1 ]' failed
   nb_loc = 1; nb_row = 1
 ✓ select_by_loc_13
 ✓ select_by_loc_14
 ✗ select_by_loc_15
   (in test file gtftk_test.bats, line 882)
     `[ "$result" -eq 3 ]' failed
   nb_loc = 1; nb_row = 1
 ✗ select_by_loc_16
   (in test file gtftk_test.bats, line 887)
     `[ "$result" -eq 66 ]' failed
   nb_loc = 1; nb_row = 1
 ✗ select_by_loc_17
   (in test file gtftk_test.bats, line 892)
     `[ "$result" -eq 6 ]' failed
   nb_loc = 1; nb_row = 2
 ✗ select_by_loc_18
   (in test file gtftk_test.bats, line 897)
     `[ "$result" -eq 54 ]' failed
   nb_loc = 1; nb_row = 2
 ✗ select_by_loc_19
   (in test file gtftk_test.bats, line 903)
     `[ "$result" -eq 7 ]' failed
   nb_loc = 1; nb_row = 1
 ✗ select_by_loc_20
   (in test file gtftk_test.bats, line 909)
     `[ "$result" -eq 63 ]' failed
   nb_loc = 1; nb_row = 1
 ✓ del_attr_1
 ✓ del_attr_1
 ✓ select_by_max_exon_nb_1
 ✓ select_by_max_exon_nb_2
 ✓ select_by_max_exon_nb_3
 ✗ select_by_intron_size_1
   (in test file gtftk_test.bats, line 954)
     `[ "$result" = "G0001T001,G0001T002,G0002T001,G0003T001,G0004T001,G0004T002,G0006T001,G0006T002,G0007T001,G0007T002,G0009T001,G0009T002,G0010T001," ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7f8378c64a90, id=4577528848, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=52, file_name=-, ptr_addr=0x7f8378c562d0, id=4577233808, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7f8378c64a90, id=4577528848, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=52, file_name=-, ptr_addr=0x7f8378c562d0, id=4577233808, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=52, file_name=-, ptr_addr=0x7f8378c562d0, id=4577233808, nb=2).
 ✗ select_by_intron_size_2
   (in test file gtftk_test.bats, line 960)
     `[ "$result" -eq 52 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7f9205b089c0, id=4636507152, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=52, file_name=-, ptr_addr=0x7f9205a58740, id=4636212112, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7f9205b089c0, id=4636507152, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=52, file_name=-, ptr_addr=0x7f9205a58740, id=4636212112, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=52, file_name=-, ptr_addr=0x7f9205a58740, id=4636212112, nb=2).
 ✗ select_by_intron_size_3
   (in test file gtftk_test.bats, line 966)
     `[ "$result" -eq 32 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7fa81f2bd8e0, id=4501466128, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0003T001,G0004T002...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=32, file_name=-, ptr_addr=0x7fa81cc9bc10, id=4501171088, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7fa81f2bd8e0, id=4501466128, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=32, file_name=-, ptr_addr=0x7fa81cc9bc10, id=4501171088, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=32, file_name=-, ptr_addr=0x7fa81cc9bc10, id=4501171088, nb=2).
 ✗ select_by_intron_size_4
   (in test file gtftk_test.bats, line 972)
     `[ "$result" = "G0001T001,G0001T002,G0002T001,G0003T001,G0004T001,G0004T002,G0006T001,G0006T002,G0007T001,G0007T002,G0008T001,G0009T001,G0009T002,G0010T001," ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7fae3e283110, id=4443081744, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=56, file_name=-, ptr_addr=0x7fae3bc902e0, id=4442786704, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7fae3e283110, id=4443081744, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=56, file_name=-, ptr_addr=0x7fae3bc902e0, id=4442786704, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=56, file_name=-, ptr_addr=0x7fae3bc902e0, id=4442786704, nb=2).
 ✗ select_by_intron_size_5
   (in test file gtftk_test.bats, line 978)
     `[ "$result" -eq 56 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7fe1f03b21d0, id=4372737040, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=56, file_name=-, ptr_addr=0x7fe1edc91530, id=4372442000, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7fe1f03b21d0, id=4372737040, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=56, file_name=-, ptr_addr=0x7fe1edc91530, id=4372442000, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=56, file_name=-, ptr_addr=0x7fe1edc91530, id=4372442000, nb=2).
 ✗ select_by_intron_size_6
   (in test file gtftk_test.bats, line 984)
     `[ "$result" -eq 28 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ff2f36b1230, id=4572728336, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0003T001,G0004T002...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=28, file_name=-, ptr_addr=0x7ff2f35d98e0, id=4572433296, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ff2f36b1230, id=4572728336, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=28, file_name=-, ptr_addr=0x7ff2f35d98e0, id=4572433296, nb=2).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=28, file_name=-, ptr_addr=0x7ff2f35d98e0, id=4572433296, nb=2).
 ✗ select_by_intron_size_7
   (in test file gtftk_test.bats, line 990)
     `[ "$result" -eq 14 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ff0270bf310, id=4566649872, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling add_attr_from_dict
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ff025df0190, id=4561778832, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ff0270bf310, id=4566649872, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=56, file_name=-, ptr_addr=0x7ff025ddf8d0, id=4566354832, nb=3).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ff025df0190, id=4561778832, nb=2).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=56, file_name=-, ptr_addr=0x7ff025ddf8d0, id=4566354832, nb=3).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=56, file_name=-, ptr_addr=0x7ff025ddf8d0, id=4566354832, nb=3).
 ✗ select_by_intron_size_8
   (in test file gtftk_test.bats, line 996)
     `[ "$result" -eq 9 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ff314da8d30, id=4537256976, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling add_attr_from_dict
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ff3159a4530, id=4532385936, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ff314da8d30, id=4537256976, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0003T001,G0004T002...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=28, file_name=-, ptr_addr=0x7ff310cc4560, id=4536961936, nb=3).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ff3159a4530, id=4532385936, nb=2).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=28, file_name=-, ptr_addr=0x7ff310cc4560, id=4536961936, nb=3).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=28, file_name=-, ptr_addr=0x7ff310cc4560, id=4536961936, nb=3).
 ✗ select_by_intron_size_9
   (in test file gtftk_test.bats, line 1002)
     `[ "$result" -eq 8 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ffe955c4020, id=4517936144, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling add_attr_from_dict
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7ffe95b684b0, id=4513065104, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ffe955c4020, id=4517936144, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0003T001,G0004T002...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=28, file_name=-, ptr_addr=0x7ffe95ab9cf0, id=4517641104, nb=3).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7ffe95b684b0, id=4513065104, nb=2).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=28, file_name=-, ptr_addr=0x7ffe95ab9cf0, id=4517641104, nb=3).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=28, file_name=-, ptr_addr=0x7ffe95ab9cf0, id=4517641104, nb=3).
 ✗ select_by_intron_size_10
   (in test file gtftk_test.bats, line 1008)
     `[ "$result" -eq 8 ]' failed
       |--- 17:08-INFO-select_by_intron_size : Searching for intronic regions.
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7fe645c96a50, id=4519259152, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling 'get_intron'.
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-INFO-select_by_intron_size : Deleting: G0006T001,G0006T002,G0006T001,G0003T001,...
       |--- 17:08-DEBUG-select_by_intron_size : Calling add_attr_from_dict
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=70, file_name=-, ptr_addr=0x7fe6436e4e20, id=4514388112, nb=2).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7fe645c96a50, id=4519259152, nb=1).
       |--- 17:08-DEBUG-select_by_intron_size : Calling extract_data.
       |--- 17:08-DEBUG-select_by_intron_size : Calling select_by_key (key=transcript_id, value=G0001T002,G0001T001...)
       |--- 17:08-DEBUG-select_by_intron_size : Creating a GTF instance.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF created (nb_lines=56, file_name=-, ptr_addr=0x7fe645904140, id=4518964112, nb=3).
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=70, file_name=-, ptr_addr=0x7fe6436e4e20, id=4514388112, nb=2).
       |--- 17:08-DEBUG-select_by_intron_size : Writing a GTF (nb_lines=56, file_name=-, ptr_addr=0x7fe645904140, id=4518964112, nb=3).
       |--- 17:08-INFO-select_by_intron_size : GTF written.
       |--- 17:08-DEBUG_MEM-select_by_intron_size : GTF deleted (nb_lines=56, file_name=-, ptr_addr=0x7fe645904140, id=4518964112, nb=3).
 ✗ select_by_nb_exons_1
   (in test file gtftk_test.bats, line 1015)
     `[ "$result" -eq 8 ]' failed
 ✗ select_by_nb_exons_2
   (in test file gtftk_test.bats, line 1020)
     `[ "$result" -eq 8 ]' failed
 ✗ select_by_nb_exons_3
   (in test file gtftk_test.bats, line 1026)
     `[ "$result" -eq 9 ]' failed
 ✗ select_by_nb_exons_4
   (in test file gtftk_test.bats, line 1032)
     `[ "$result" -eq 3 ]' failed
 ✗ select_by_nb_exons_5
   (in test file gtftk_test.bats, line 1038)
     `[ "$result" -eq 25 ]' failed
 ✗ select_by_nb_exons_6
   (in test file gtftk_test.bats, line 1044)
     `[ "$result" -eq 60 ]' failed
 ✓ select_by_nb_exons_7
 ✗ select_by_nb_exons_8
   (in test file gtftk_test.bats, line 1056)
     `[ "$result" -eq 8 ]' failed
 ✗ select_by_nb_exons_9
   (in test file gtftk_test.bats, line 1062)
     `[ "$result" -eq 4 ]' failed
 ✗ overlapping_1
   (in test file gtftk_test.bats, line 1070)
     `[ "$result" = "G0005T001,G0010T001," ]' failed
       |--- 17:08-INFO-overlapping : Checking chromosome info file.
       |--- 17:08-INFO-overlapping : Using -u 2
       |--- 17:08-INFO-overlapping : Using -d 2
       |--- 17:08-INFO-overlapping : Getting transcript in bed format
       |--- 17:08-INFO-overlapping : Getting tts and 'slopping'.
       |--- 17:08-INFO-overlapping : GTF written.
 ✗ overlapping_2
   (in test file gtftk_test.bats, line 1077)
     `[ "$result" = "G0002T001,G0006T001,G0006T002," ]' failed
       |--- 17:08-INFO-overlapping : Checking chromosome info file.
       |--- 17:08-INFO-overlapping : Using -u 2
       |--- 17:08-INFO-overlapping : Using -d 2
       |--- 17:08-INFO-overlapping : Getting transcript in bed format
       |--- 17:08-INFO-overlapping : Getting promoter and 'slopping'.
       |--- 17:08-INFO-overlapping : GTF written.
 ✓ overlapping_3
 ✗ overlapping_4
   (in test file gtftk_test.bats, line 1089)
     `[ "$result" = "G0002T001,G0006T001,G0006T002," ]' failed
       |--- 17:08-INFO-overlapping : Checking chromosome info file.
       |--- 17:08-INFO-overlapping : Using -u 2
       |--- 17:08-INFO-overlapping : Using -d 2
       |--- 17:08-INFO-overlapping : Getting transcript in bed format
       |--- 17:08-INFO-overlapping : Getting promoter and 'slopping'.
       |--- 17:08-INFO-overlapping : GTF written.
 ✗ overlapping_5
   (in test file gtftk_test.bats, line 1095)
     `[ "$result" = "G0005T001,G0010T001," ]' failed
       |--- 17:08-INFO-overlapping : Checking chromosome info file.
       |--- 17:08-INFO-overlapping : Using -u 2
       |--- 17:08-INFO-overlapping : Using -d 2
       |--- 17:08-INFO-overlapping : Getting transcript in bed format
       |--- 17:08-INFO-overlapping : Getting tts and 'slopping'.
       |--- 17:08-INFO-overlapping : GTF written.
 ✓ overlapping_6
 ✗ overlapping_7
   (in test file gtftk_test.bats, line 1108)
     `[ "$result" -eq 2 ]' failed with status 2
       |--- 17:08-INFO-overlapping : Checking chromosome info file.
       |--- 17:08-INFO-overlapping : Using -u 2
       |--- 17:08-INFO-overlapping : Using -d 2
       |--- 17:08-INFO-overlapping : Getting transcript in bed format
       |--- 17:08-INFO-overlapping : Getting tts and 'slopping'.
       |--- 17:08-INFO-overlapping : GTF written.
       |--- 17:08-ERROR-nb_exons : Could not find any 'transcript' or 'gene'  line in the file. Is this GTF file in ensembl format ? Try to convert it with convert_ensembl command.
   /var/folders/nl/stqvvbcn4pg9v65k8kyjv5yh0000gn/T/bats.57973.src: line 1108: [: : integer expression expected
 ✗ overlapping_8
   (in test file gtftk_test.bats, line 1114)
     `[ "$result" -eq 70 ]' failed
 ✗ overlapping_9
   (in test file gtftk_test.bats, line 1121)
     `[ "$result" = "G0005T001,G0010T001," ]' failed
 ✓ seqid_list_1
 ✓ seqid_list_2
 ✓ control_list_1
 ✓ control_list_2
 ✓ profile_1
 ✗ profile_2
   (in test file gtftk_test.bats, line 1161)
     `result=`gtftk mk_matrix -i mini_real_noov_rnd_tx.gtf -d 5000 -u 5000 -w 200 -c hg38.genome  -l  H3K4me3,H3K79me,H3K36me3 ENCFF742FDS_H3K4me3_K562_sub.bw ENCFF947DVY_H3K79me2_K562_sub.bw ENCFF431HAA_H3K36me3_K562_sub.bw -o mini_real_promoter`' failed with status 139
 ✓ profile_3
 ✓ profile_4
 ✓ profile_5
 ✓ profile_6
 ✓ profile_7
 ✗ profile_8
   (in test file gtftk_test.bats, line 1198)
     `result=`gtftk mk_matrix -i mini_real_noov_rnd_tx.gtf -t transcript  -d 5000 -u 5000 -w 200 -c hg38.genome  -l  H3K4me3,H3K79me,H3K36me3 ENCFF742FDS_H3K4me3_K562_sub.bw ENCFF947DVY_H3K79me2_K562_sub.bw ENCFF431HAA_H3K36me3_K562_sub.bw -o mini_real_tx`' failed with status 139
       |--- 17:09-WARNING-mk_matrix : Encountered regions shorter than bin number.
       |--- 17:09-WARNING-mk_matrix : ENST00000612829 has length : 85
       |--- 17:09-WARNING-mk_matrix : They will be set to NA or --pseudo-count depending on --zero-to-na.
       |--- 17:09-WARNING-mk_matrix : Filter them out please.
 ✓ profile_9
 ✓ profile_10
 ✓ profile_11
 ✓ profile_12
 ✓ profile_13
 ✓ profile_14
 ✓ profile_15
 ✓ profile_16
 ✓ profile_17
 ✓ profile_18
 ✗ profile_19
   (in test file gtftk_test.bats, line 1266)
     `result=`gtftk mk_matrix --bin-around-frac 0.5 -i mini_real_noov_rnd_tx.gtf -t transcript  -d 5000 -u 5000 -w 200 -c hg38.genome  -l  H3K4me3,H3K79me,H3K36me3 ENCFF742FDS_H3K4me3_K562_sub.bw ENCFF947DVY_H3K79me2_K562_sub.bw ENCFF431HAA_H3K36me3_K562_sub.bw -o mini_real_tx_2`' failed with status 139
       |--- 17:10-WARNING-mk_matrix : Encountered regions shorter than bin number.
       |--- 17:10-WARNING-mk_matrix : ENST00000612829 has length : 85
       |--- 17:10-WARNING-mk_matrix : They will be set to NA or --pseudo-count depending on --zero-to-na.
       |--- 17:10-WARNING-mk_matrix : Filter them out please.
 ✓ profile_20
 ✓ profile_21
 ^Cprofile_22      
dputhier commented 6 years ago

Note that commenting out the freeing function makes all select_by_key tests pass. There is something to dig into here.

$ make bats_cmd CMD=select_by_key
 ✓ select_by_key_1
 ✓ select_by_key_2
 ✓ select_by_key_3
 ✓ select_by_key_4
 ✓ select_by_key_5
 ✓ select_by_key_6
 ✓ select_by_key_7
 ✓ select_by_key_8
 ✓ select_by_key_9
 ✓ select_by_key_10
 ✓ select_by_key_11
 ✓ select_by_key_12
 ✓ select_by_key_13
 ✓ select_by_key_14

14 tests, 0 failures
fafa13 commented 6 years ago

Could you try again?

dputhier commented 6 years ago

Same here. If I comment out free_gtf_data, everything works; otherwise it segfaults.
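
The `failed with status 139` lines in the mk_matrix tests earlier in this thread are consistent with a segfault: a POSIX shell reports 128 plus the signal number when a process is killed by a signal, and SIGSEGV is signal 11 on Linux and macOS. A minimal sketch for decoding such a status (`describe_status` is a hypothetical helper, not part of gtftk or bats):

```shell
# Hypothetical helper: turn a shell exit status into a readable description.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
describe_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix.
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}
```

Calling `describe_status 139` should report signal 11 (SEGV), whereas the `status 2` failures come from the `[` test itself receiving an empty `$result`.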

fafa13 commented 6 years ago

Apparently you haven't pulled in the latest changes I made to free_gtf_data.c.

fafa13 commented 6 years ago

Only 2 errors left:

splicing_site_1 and add_exon_nb_5

dputhier commented 6 years ago

Is this version available?

fafa13 commented 6 years ago

Yes, it is available in the develop_3.3 branch of libgtftk.

dputhier commented 6 years ago

OK. This one seems to be OSX specific :). Looks better under Linux (pedagogix). Bye

dputhier commented 6 years ago

All tests pass on pedagogix. Note that you can run only the tests for a given command using:

   make bats_cmd CMD=the_name_of_the_command
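
To check several commands in one go, the same target can be called in a loop. This is only a sketch: `run_cmd_tests` and the `MAKE` override are hypothetical names, and it assumes that `make bats_cmd` exits with a non-zero status when at least one test fails.

```shell
# Hypothetical wrapper: run `make bats_cmd CMD=<name>` for each command given.
# Set MAKE=echo for a dry run that only prints the arguments passed to make.
run_cmd_tests() {
    make_bin="${MAKE:-make}"
    failed=""
    for cmd in "$@"; do
        # Assumption: the make target returns non-zero when a test fails.
        if ! "$make_bin" bats_cmd CMD="$cmd"; then
            failed="$failed $cmd"
        fi
    done
    if [ -n "$failed" ]; then
        echo "Commands with failing tests:$failed"
        return 1
    fi
    echo "All requested commands passed."
}
```

For example, `run_cmd_tests select_by_key overlapping` would run both test groups and list the ones that still fail.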
dputhier commented 6 years ago

Fixed in f52ee3078a34aa5b3417fb82161ffb26e4fd3261